json2tsv.1 - json2tsv - JSON to TSV converter
.Dd April 17, 2023
.Dt JSON2TSV 1
.Os
.Sh NAME
.Nm json2tsv
.Nd convert JSON to TSV or separated output
.Sh SYNOPSIS
.Nm
.Op Fl n
.Op Fl r
.Op Fl u
.Op Fl F Ar fs
.Op Fl R Ar rs
.Sh DESCRIPTION
.Nm
reads JSON data from stdin.
It outputs each JSON node on its own line in a TAB-Separated Value format.
.Pp
The options are as follows:
.Bl -tag -width Ds
.It Fl n
Show the indices for array types (by default off).
.It Fl r
Show all control-characters (by default off).
.It Fl u
Unbuffered: flush output after printing each value (by default off).
.It Fl F Ar fs
Use
.Ar fs
as the field separator.
The default is a TAB character.
.It Fl R Ar rs
Use
.Ar rs
as the record separator.
The default is a newline character.
.El
.Sh SEPARATOR CHARACTERS
The
.Ar fs
or
.Ar rs
separators can be specified in the following formats:
.Pp
.Bl -item -compact
.It
\e\e for a backslash character.
.It
\en for a newline character.
.It
\er for a carriage return character.
.It
\et for a TAB character.
.It
\exXX for a character specified in the hexadecimal format as XX.
.It
\eNNN for a character specified in the octal format as NNN.
.El
.Pp
Otherwise, if a single character is specified, that character is used.
If more than one character is specified, the argument is parsed as a number
using the format supported by
.Xr strtol 3
with base set to 0, and the resulting number is used as the index of the
separator character in the ASCII table.
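.Pp
As an illustration of the formats above, each of the following invocations
should select the ASCII character 0x1f (decimal 31) as the field separator.
This is a sketch derived from the rules in this section, not verified output:
.Bd -literal
json2tsv -F '\ex1f' < input.json
json2tsv -F '\e037' < input.json
json2tsv -F 0x1f < input.json
json2tsv -F 037 < input.json
json2tsv -F 31 < input.json
.Ed
.Pp
The first two forms use the hexadecimal and octal escapes.
The last three contain more than one character and no escape, so they are
assumed to be parsed by
.Xr strtol 3
as hexadecimal, octal and decimal numbers respectively.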
.Sh OUTPUT FORMAT
The output format per node is:
.Bd -literal
nodename<FIELD SEPARATOR>type<FIELD SEPARATOR>value<RECORD SEPARATOR>
.Ed
.Pp
Control-characters such as a newline, TAB and backslash (\en, \et and \e\e) are
escaped in the nodename and value fields unless a
.Fl F
or
.Fl R
option is specified.
.Pp
When the
.Fl F
or
.Fl R
option is specified then the separator characters are removed from the output.
TABs or newlines are printed unless they are set as a separator.
Other control-characters are removed, unless the option
.Fl r
is set.
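.Pp
For instance, assuming a comma is chosen as the field separator, the rules
above imply that commas occurring in a nodename or value are removed from the
output rather than escaped.
A minimal sketch of such an invocation:
.Bd -literal
json2tsv -F ',' < input.json
.Ed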
.Pp
The type field is a single byte and can be:
.Pp
.Bl -item -compact
.It
a for array
.It
b for bool
.It
n for number
.It
o for object
.It
s for string
.It
? for null
.El
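.Pp
As a sketch of how the type field can be used for filtering, the following
prints the nodename of every string value, assuming the default separators:
.Bd -literal
json2tsv < input.json | awk -F '\et' '$2 == "s" { print $1 }'
.Ed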
.Sh EXIT STATUS
.Nm
exits with the exit status 0 on success, 1 on a parse error, 2 when out of
memory or on a read/write error, or 3 on a usage error.
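.Pp
A shell script can branch on these values.
The following is a minimal sketch; the output file name and the messages are
only illustrative:
.Bd -literal
json2tsv < input.json > output.tsv
case $? in
0) ;;
1) echo "parse error" >&2 ;;
2) echo "out of memory or read/write error" >&2 ;;
3) echo "usage error" >&2 ;;
esac
.Ed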
.Sh EXAMPLES
.Bd -literal
json2tsv < input.json | awk -F '\et' '$1 == ".url" { print $3 }'
.Ed
.Pp
To filter without having to unescape characters the
.Fl F
and
.Fl R
options can be used.
The example below uses the ASCII character 0x1f (Unit Separator) as the
field separator and the ASCII character 0x1e (Record Separator) as the record
separator.
Additionally the
.Fl r
option is used so control-characters are printed.
.Bd -literal
json2tsv -r -F '\ex1f' -R '\ex1e' < input.json | \e
	awk '
	BEGIN {
		FS = "\ex1f"; RS = "\ex1e";
	}
	$1 == ".url" {
		print $3;
	}'
.Ed
.Pp
The example can be simplified using the convenience wrapper shell script
.Xr jaq 1 :
.Bd -literal
jaq '$1 == ".url" { print $3 }' < input.json
.Ed
.Sh SEE ALSO
.Xr awk 1 ,
.Xr jaq 1
.Sh AUTHORS
.An Hiltjo Posthuma Aq Mt hiltjo@codemadness.org
.Sh CAVEATS
.Bl -item
.It
Characters in object keys such as a dot or brackets are not escaped in the TSV
output.
This can change the meaning of the nodename field.
.It
The JSON parser handles all valid JSON.
It also allows some invalid JSON extensions: it does not do a complete
validation on numbers and is not strict with handling Unicode input.
See also RFC 8259 section 9 (Parsers).
.It
The maximum depth of objects or arrays is hard-coded to 64 levels.
.El