Pyexcel-io Plugin guide
There are many plugins for reading and writing different file types. This guide helps you choose among them.
Package name | Supported file formats | Dependencies | Python versions |
---|---|---|---|
pyexcel-io | csv, csvz [1], tsv, tsvz [2] | | 2.6, 2.7, 3.3, 3.4, 3.5, 3.6, pypy |
pyexcel-xls | xls, xlsx(read only), xlsm(read only) | xlrd, xlwt | same as above |
pyexcel-xlsx | xlsx | openpyxl | same as above |
pyexcel-ods3 | ods | pyexcel-ezodf, lxml | 2.6, 2.7, 3.3, 3.4, 3.5, 3.6 |
pyexcel-ods | ods | odfpy | same as above |
Package name | Supported file formats | Dependencies | Python versions |
---|---|---|---|
pyexcel-xlsxw | xlsx(write only) | XlsxWriter | Python 2 and 3 |
pyexcel-xlsxr | xlsx(read only) | lxml | same as above |
pyexcel-odsr | ods(read only), fods(read only) | lxml | same as above |
pyexcel-htmlr | html(read only) | lxml, html5lib | same as above |
Use pip to add or remove plugins. With virtualenv, each virtual environment can have its own set of plugins. If several installed plugins handle the same file format, you need to tell pyexcel which one to use on each function call. For example, if both pyexcel-ods and pyexcel-odsr are installed and you want get_array to use pyexcel-odsr, call get_array(…, library='pyexcel-odsr').
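The plugin selection described above can be sketched as follows. This is a minimal illustration, not part of pyexcel itself: `plugin_kwargs` and `read_array` are hypothetical helper names, the file name in the usage note is made up, and the code assumes pyexcel and the chosen plugin are installed when `read_array` is actually called.

```python
def plugin_kwargs(file_name, library=None):
    """Build keyword arguments for a pyexcel reader call (hypothetical helper)."""
    kwargs = {"file_name": file_name}
    if library is not None:
        # `library` is the keyword named in this guide for picking a plugin.
        kwargs["library"] = library
    return kwargs


def read_array(file_name, library=None):
    """Read a file into a list of rows, optionally forcing one plugin."""
    # Imported lazily so plugin_kwargs stays usable without pyexcel installed.
    import pyexcel

    return pyexcel.get_array(**plugin_kwargs(file_name, library))
```

With both ods plugins installed, `read_array("report.ods", library="pyexcel-odsr")` would route the call to pyexcel-odsr; leaving `library` out lets pyexcel pick a plugin on its own.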
Footnotes
[1] | zipped csv file |
[2] | zipped tsv file |
Read and write with performance
Partial reading
csv and tsv by pyexcel-io, ods by pyexcel-odsr, and html by pyexcel-htmlr are implemented in partial read mode. If you only need the first half of a file, the second half is not read into memory, but only if you use the igetters (iget_records, iget_array) and the isavers (isave_as, isave_book_as).
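The idea behind partial reading can be illustrated with the standard library alone. The sketch below is an analogy under stated assumptions, not pyexcel-io's implementation: `numbered_lines` and `first_rows` are hypothetical names, and the line counter exists only to show that rows beyond the requested ones are never produced.

```python
import csv
import itertools


def numbered_lines(total, stats):
    """Yield CSV lines one at a time, counting how many were produced."""
    for i in range(total):
        stats["lines_read"] += 1
        yield f"{i},{i * i}\n"


def first_rows(rows, limit):
    """Consume at most `limit` rows from a lazy row iterator."""
    return list(itertools.islice(rows, limit))


stats = {"lines_read": 0}
# csv.reader pulls lines from the generator only when a row is requested.
rows = csv.reader(numbered_lines(1_000_000, stats))
head = first_rows(rows, 3)
```

Only three of the million lines are ever generated. With pyexcel installed, the same `first_rows` helper could presumably be applied to the generator returned by `pyexcel.iget_array(file_name=...)`, followed by `pyexcel.free_resources()` to close the underlying file handle.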
Read on demand
xls by pyexcel-xls reads sheets on demand. If you need only one sheet from a multi-sheet book, the rest of the sheets in the book are not read.
Streaming write
csv and tsv by pyexcel-io support streaming write.
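As a rough picture of what streaming write means, the sketch below writes rows to a csv file one at a time as a generator produces them, so the full data set never sits in memory. It uses only the standard library and is an illustration of the idea rather than pyexcel-io's code; `generate_rows` and `stream_to_csv` are hypothetical names.

```python
import csv
import os
import tempfile


def generate_rows(count):
    """Produce rows lazily instead of building a full list first."""
    for i in range(count):
        yield [i, i * 2]


def stream_to_csv(rows, path):
    """Write each row as soon as it is produced; return the row count."""
    written = 0
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for row in rows:
            writer.writerow(row)
            written += 1
    return written


path = os.path.join(tempfile.mkdtemp(), "streamed.csv")
count = stream_to_csv(generate_rows(5), path)
```

In pyexcel itself, the isave_as function mentioned earlier in this guide is the entry point for this style of writing.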
Write with constant memory
xlsx by pyexcel-xlsxw can write big data with constant memory consumption.