@@ -357,95 +357,6 @@ warn_bad_lines : boolean, default ``True``
   If error_bad_lines is ``False``, and warn_bad_lines is ``True``, a warning for
   each "bad line" will be output (only valid with C parser).
 
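For illustration only (this sketch is not part of the change shown here, and it assumes
``pd`` and ``StringIO`` are available as in the other examples in this document), the
interaction of these two options looks roughly like:

.. ipython:: python
   :okwarning:

   # line 3 has one field too many; with error_bad_lines=False it is skipped,
   # and warn_bad_lines=True (the default) reports it as a warning
   data = 'a,b,c\n1,2,3\n4,5,6,7\n8,9,10'
   pd.read_csv(StringIO(data), error_bad_lines=False, warn_bad_lines=True)
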
-.. ipython:: python
-   :suppress:
-
-   f = open('foo.csv', 'w')
-   f.write('date,A,B,C\n20090101,a,1,2\n20090102,b,3,4\n20090103,c,4,5')
-   f.close()
-
-Consider a typical CSV file containing, in this case, some time series data:
-
-.. ipython:: python
-
-   print(open('foo.csv').read())
-
-The default for ``read_csv`` is to create a DataFrame with simple numbered rows:
-
-.. ipython:: python
-
-   pd.read_csv('foo.csv')
-
-In the case of indexed data, you can pass the column number or column name you
-wish to use as the index:
-
-.. ipython:: python
-
-   pd.read_csv('foo.csv', index_col=0)
-
-.. ipython:: python
-
-   pd.read_csv('foo.csv', index_col='date')
-
-You can also use a list of columns to create a hierarchical index:
-
-.. ipython:: python
-
-   pd.read_csv('foo.csv', index_col=[0, 'A'])
-
-.. _io.dialect:
-
-The ``dialect`` keyword gives greater flexibility in specifying the file format.
-By default it uses the Excel dialect but you can specify either the dialect name
-or a :class:`python:csv.Dialect` instance.
-
-.. ipython:: python
-   :suppress:
-
-   data = ('label1,label2,label3\n'
-           'index1,"a,c,e\n'
-           'index2,b,d,f')
-
-Suppose you had data with unenclosed quotes:
-
-.. ipython:: python
-
-   print(data)
-
-By default, ``read_csv`` uses the Excel dialect and treats the double quote as
-the quote character, which causes it to fail when it finds a newline before it
-finds the closing double quote.
-
-We can get around this using ``dialect``:
-
-.. ipython:: python
-   :okwarning:
-
-   dia = csv.excel()
-   dia.quoting = csv.QUOTE_NONE
-   pd.read_csv(StringIO(data), dialect=dia)
-
-All of the dialect options can be specified separately by keyword arguments:
-
-.. ipython:: python
-
-   data = 'a,b,c~1,2,3~4,5,6'
-   pd.read_csv(StringIO(data), lineterminator='~')
-
-Another common dialect option is ``skipinitialspace``, to skip any whitespace
-after a delimiter:
-
-.. ipython:: python
-
-   data = 'a, b, c\n1, 2, 3\n4, 5, 6'
-   print(data)
-   pd.read_csv(StringIO(data), skipinitialspace=True)
-
-The parsers make every attempt to "do the right thing" and not be very
-fragile. Type inference is a pretty big deal. So if a column can be coerced to
-integer dtype without altering the contents, it will do so. Any non-numeric
-columns will come through as object dtype as with the rest of pandas objects.
-
 .. _io.dtypes:
 
 Specifying column data types
@@ -1239,6 +1150,62 @@ data that appear in some lines but not others:
    1  4  5   6
    2  8  9  10
 
+.. _io.dialect:
+
+Dialect
+'''''''
+
+The ``dialect`` keyword gives greater flexibility in specifying the file format.
+By default it uses the Excel dialect but you can specify either the dialect name
+or a :class:`python:csv.Dialect` instance.
+
+.. ipython:: python
+   :suppress:
+
+   data = ('label1,label2,label3\n'
+           'index1,"a,c,e\n'
+           'index2,b,d,f')
+
+Suppose you had data with unenclosed quotes:
+
+.. ipython:: python
+
+   print(data)
+
+By default, ``read_csv`` uses the Excel dialect and treats the double quote as
+the quote character, which causes it to fail when it finds a newline before it
+finds the closing double quote.
+
+We can get around this using ``dialect``:
+
+.. ipython:: python
+   :okwarning:
+
+   dia = csv.excel()
+   dia.quoting = csv.QUOTE_NONE
+   pd.read_csv(StringIO(data), dialect=dia)
+
+All of the dialect options can be specified separately by keyword arguments:
+
+.. ipython:: python
+
+   data = 'a,b,c~1,2,3~4,5,6'
+   pd.read_csv(StringIO(data), lineterminator='~')
+
+Another common dialect option is ``skipinitialspace``, to skip any whitespace
+after a delimiter:
+
+.. ipython:: python
+
+   data = 'a, b, c\n1, 2, 3\n4, 5, 6'
+   print(data)
+   pd.read_csv(StringIO(data), skipinitialspace=True)
+
+The parsers make every attempt to "do the right thing" and not be very
+fragile. Type inference is a pretty big deal. So if a column can be coerced to
+integer dtype without altering the contents, it will do so. Any non-numeric
+columns will come through as object dtype as with the rest of pandas objects.
+
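As an illustrative sketch (not part of the change above, and again assuming ``pd``
and ``StringIO`` are available as in the surrounding examples), the inferred dtypes
can be inspected directly with ``DataFrame.dtypes``:

.. ipython:: python

   # columns a and b can be coerced to integers, so they come back as int64;
   # column c contains non-numeric values and therefore stays object dtype
   data = 'a,b,c\n1,2,x\n4,5,y'
   pd.read_csv(StringIO(data)).dtypes
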
 .. _io.quoting:
 
 Quoting and Escape Characters