 .TH TCS 1
 .SH NAME
 tcs \- translate character sets
 .SH SYNOPSIS
 .B tcs
 [
 .B -slcv
 ]
 [
 .B -f
 .I ics
 ]
 [
 .B -t
 .I ocs
 ]
 [
 .I file ...
 ]
 .SH DESCRIPTION
 .I Tcs
 interprets the named
 .I file(s)
 (standard input default) as a stream of characters from the
 .I ics
 character set or format, converts them to runes,
 and then converts them into a stream of characters from the
 .I ocs
 character set or format on the standard output.
 The default value for
 .I ics
 and
 .I ocs
 is
 .BR utf ,
 the
 .SM UTF
 encoding described in
 .IR utf (7).
 The
 .B -l
 option lists the character sets known to
 .IR tcs .
 Processing continues in the face of conversion errors (the
 .B -s
 option prevents reporting of these errors).
 The
 .B -c
 option forces the output to contain only correctly converted characters;
 otherwise,
 .B 0x80
 characters will be substituted for
 .SM UTF
 encoding errors and
 .B 0xFFFD
 characters will be substituted for unknown characters.
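 .PP
 The two-stage conversion described above can be sketched in Python.
 This is only an illustration of the model, not the code in
 .IR tcs :
 the function name
 .B translate
 and the
 .B clean
 parameter are hypothetical, the codec names are Python's, and
 Python substitutes 0xFFFD for every kind of bad input and
 .B ?
 for characters that cannot be encoded, rather than distinguishing
 the two cases as
 .I tcs
 does.
 .EX
```python
def translate(data: bytes, ics: str = "utf-8", ocs: str = "utf-8",
              clean: bool = False) -> bytes:
    """Decode data from ics into code points, then encode them as ocs."""
    # With clean=True, bad input and unencodable characters are dropped,
    # loosely like -c; otherwise Python substitutes U+FFFD on input
    # and "?" on output.
    policy = "ignore" if clean else "replace"
    runes = data.decode(ics, errors=policy)
    return runes.encode(ocs, errors=policy)


if __name__ == "__main__":
    latin1 = bytes([0x63, 0x61, 0x66, 0xE9])  # "cafe" with e-acute, Latin-1
    print(translate(latin1, ics="latin-1").hex())
```
 .EE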
 .PP
 The
 .B -v
 option generates various diagnostic and summary information on standard error,
 or makes the
 .B -l
 output more verbose.
 .PP
 .I Tcs
 recognizes an ever-changing list of character sets.
 In particular, it supports a variety of Russian and Japanese encodings.
 Some of the supported encodings are
 .TF jis-kanji
 .TP
 .B utf
 The Plan 9
 .SM UTF
 encoding, known by ISO as UTF-8
 .TP
 .B utf1
 The deprecated original
 .SM UTF
 encoding from ISO 10646
 .TP
 .B ascii
 7-bit ASCII
 .TP
 .B 8859-1
 Latin-1 (Western European)
 .TP
 .B 8859-2
 Latin-2 (Czech .. Slovak)
 .TP
 .B 8859-3
 Latin-3 (Dutch .. Turkish)
 .TP
 .B 8859-4
 Latin-4 (Scandinavian)
 .TP
 .B 8859-5
 Part 5 (Cyrillic)
 .TP
 .B 8859-6
 Part 6 (Arabic)
 .TP
 .B 8859-7
 Part 7 (Greek)
 .TP
 .B 8859-8
 Part 8 (Hebrew)
 .TP
 .B 8859-9
 Latin-5 (Finnish .. Portuguese)
 .TP
 .B koi8
 KOI-8 (GOST 19769-74)
 .TP
 .B jis-kanji
 ISO 2022-JP
 .TP
 .B ujis
 EUC-JX: JIS 0208
 .TP
 .B ms-kanji
 Microsoft, or Shift-JIS
 .TP
 .B jis
 (from only) guesses between ISO 2022-JP, EUC or Shift-JIS
 .TP
 .B gb
 Chinese national standard (GB2312-80)
 .TP
 .B big5
 Big 5 (HKU version)
 .TP
 .B unicode
 Unicode Standard 1.0
 .TP
 .B tis
 Thai character set plus
 .SM ASCII
 (TIS 620-1986)
 .TP
 .B msdos
 IBM PC: CP 437
 .TP
 .B atari
 Atari-ST character set
 .SH EXAMPLES
 .TP
 .B tcs -f 8859-1
 Convert 8859-1 (Latin-1) characters into
 .SM UTF
 format.
 .TP
 .B tcs -s -f jis
 Convert characters encoded in one of several JIS encodings (ISO 2022-JP, EUC or Shift-JIS) into
 .SM UTF
 format.
 Unknown Kanji will be converted into
 .B 0xFFFD
 characters.
 .TP
 .B tcs -lv
 Print an up-to-date list of the supported character sets.
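 .PP
 The
 .B jis
 input format guesses among three encodings.
 One simple way to make such a guess is trial decoding, sketched
 below in Python.
 This is an assumption for illustration only:
 .I tcs
 may use a different heuristic, and the function name
 .B guess_jis
 and the codec names are Python's, not part of
 .IR tcs .
 .EX
```python
from typing import Optional

# Candidate codecs, tried in order; the names are Python's, not tcs's.
CANDIDATES = ("iso2022_jp", "euc_jp", "shift_jis")


def guess_jis(data: bytes) -> Optional[str]:
    """Return the first candidate codec that decodes data without error."""
    for name in CANDIDATES:
        try:
            data.decode(name)  # strict decoding raises on invalid bytes
        except UnicodeDecodeError:
            continue
        return name
    return None


if __name__ == "__main__":
    sample = "日本語".encode("shift_jis")
    print(guess_jis(sample))
```
 .EE
 Trial decoding can be fooled by short inputs that are valid in more
 than one encoding, so the order of the candidates matters.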
 .SH SOURCE
 .B \*9/src/cmd/tcs
 .SH SEE ALSO
 .IR ascii (1),
 .IR rune (3),
 .IR utf (7).