pyexcel - Lets you focus on data, instead of file formats

Author: C.W.
Source code: http://github.com/chfw/pyexcel
Issues: http://github.com/chfw/pyexcel/issues
License: GPL v3
Version: 0.1.1

Introduction

pyexcel is a wrapper library to read, manipulate and write data in different Excel formats. It aims to make information processing involving Excel files an enjoyable task. Data in Excel files can be turned into an array or a dict with minimal code, and vice versa. Ready-made custom filters and formatters can also be applied. However, this library is not made to support fonts, colors and charts.

It was created due to the lack of a uniform programming interface for accessing data in different Excel formats. A developer has to use different methods from different libraries to read the same data in different Excel formats, so the resulting code is cluttered and unmaintainable.

In addition, the library recognizes that Excel files are the de facto file format for information sharing in non-software-centric organisations. Excel files are used not only for mathematical computation in financial institutions but also for many other purposes in an office work environment.

All the great work has been done by the individual library developers; this library only unites the data access code. That said, pyexcel also brings something new to the table: the “csvz” and “tsvz” formats, new format names as of 2014, are zipped csv or tsv files and are supported by pyexcel.
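
As a rough illustration of what a zipped csv file involves, the sketch below writes and reads such an archive using only the Python standard library. It is not pyexcel's implementation: the helper names write_csvz and read_csvz, the member name sheet.csv, and the layout of one csv member per sheet are assumptions made for this example.

```python
import csv
import io
import zipfile


def write_csvz(path, rows, member="sheet.csv"):
    """Write rows (a list of lists) as one csv member inside a zip archive.

    The member name is an assumption made for this illustration.
    """
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(member, buffer.getvalue())


def read_csvz(path):
    """Return {member name: list of rows} for every member of the archive."""
    sheets = {}
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            text = archive.read(name).decode("utf-8")
            sheets[name] = list(csv.reader(io.StringIO(text)))
    return sheets
```

Both helpers accept either a file name or a binary file-like object. Values read back this way are plain strings, because csv carries no type information.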

Getting the source

The source code is hosted on GitHub. You can get it using a git client:

$ git clone http://github.com/chfw/pyexcel.git

Usage

Here are some usage examples:

>>> import pyexcel as pe
>>> import pyexcel.ext.xls # import it to be able to handle xls files
>>> import pyexcel.ext.xlsx # xlsx file
>>> sheet = pe.load("your_file.xls")
>>> sheet # ascii representation of the content
Sheet Name: Sheet 1
+----------+----------+----------+
| 1        | 2        | 3        |
+----------+----------+----------+
| Column 1 | Column 2 | Column 3 |
+----------+----------+----------+
| 4        | 5        | 6        |
+----------+----------+----------+
>>> sheet["A1"]
1.0
>>> # format a row using a lambda function
>>> sheet.row.format(1, str, lambda value: str(value))
>>> sheet.column[0]
[1.0, 'Column 1', 4.0]
>>> sheet.row[2]
[4.0, 5.0, 6.0]
>>> sheet.name_columns_by_row(1)
>>> sheet.column["Column 1"]
[1.0, 4.0]
>>> sheet.save_as("myfile.csv")
>>> # load the whole excel file
>>> book = pe.load_book("your_file.xls")
>>> book
Sheet Name: Sheet 1
+----------+----------+----------+
| 1        | 2        | 3        |
+----------+----------+----------+
| Column 1 | Column 2 | Column 3 |
+----------+----------+----------+
| 4        | 5        | 6        |
+----------+----------+----------+
Sheet Name: Sheet 2
+---+---+---+-------+
| a | b | c | Row 1 |
+---+---+---+-------+
| e | f | g | Row 2 |
+---+---+---+-------+
| 1 | 2 | 3 | Row 3 |
+---+---+---+-------+
>>> # alternative access to the same cell on sheet 1
>>> print(book["Sheet 1"][0,0])
1.0
>>> book["Sheet 2"].name_rows_by_column(3)
>>> book["Sheet 2"].row["Row 3"]
[1.0, 2.0, 3.0]
>>> book.save_as("new_file.xlsx") # save a copy
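
To make the effect of naming columns by a row more concrete, here is a minimal plain-Python sketch that reproduces the "Column 1" lookup from the example above on a list of lists. The helper columns_by_header_row is hypothetical and is not part of pyexcel; it only mirrors the behaviour visible in the output shown.

```python
def columns_by_header_row(rows, header_index):
    """Return {header: column values}, leaving the header row out of the data."""
    headers = rows[header_index]
    data_rows = [row for i, row in enumerate(rows) if i != header_index]
    return {
        header: [row[col] for row in data_rows]
        for col, header in enumerate(headers)
    }


# Same content as "Sheet 1" in the example above.
rows = [
    [1.0, 2.0, 3.0],
    ["Column 1", "Column 2", "Column 3"],
    [4.0, 5.0, 6.0],
]
columns = columns_by_header_row(rows, 1)
print(columns["Column 1"])  # prints [1.0, 4.0]
```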

Installation

You can install it via pip:

$ pip install pyexcel

To work with individual Excel file formats, install the corresponding plugins as needed:

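+--------------+---------------------------------------+--------------+--------------------------+---------------------------+
| Plugins      | Supported file formats                | Dependencies | Python versions          | Comments                  |
+==============+=======================================+==============+==========================+===========================+
| pyexcel      | csv, csvz [1], tsv, tsvz [2]          | pyexcel-io   | 2.6, 2.7, 3.3, 3.4, pypy |                           |
+--------------+---------------------------------------+--------------+--------------------------+---------------------------+
| pyexcel-xls  | xls, xlsx(read only), xlsm(read only) | xlrd, xlwt   | 2.6, 2.7, 3.3, 3.4, pypy | only supports writing xls |
+--------------+---------------------------------------+--------------+--------------------------+---------------------------+
| pyexcel-xlsx | xlsx                                  | openpyxl     | 2.6, 2.7, 3.3, 3.4, pypy |                           |
+--------------+---------------------------------------+--------------+--------------------------+---------------------------+
| pyexcel-ods  | ods (python 2.6, 2.7)                 | odfpy        | 2.6, 2.7                 |                           |
+--------------+---------------------------------------+--------------+--------------------------+---------------------------+
| pyexcel-ods3 | ods (python 2.7, 3.3, 3.4)            | ezodf, lxml  | 3.3, 3.4                 |                           |
+--------------+---------------------------------------+--------------+--------------------------+---------------------------+
| pyexcel-text | json, rst, mediawiki, latex, grid,    | tabulate     | 2.6, 2.7, 3.3, 3.4, pypy | only supports writing     |
|              | pipe, orgtbl, plain, simple           |              |                          | to files                  |
+--------------+---------------------------------------+--------------+--------------------------+---------------------------+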

Please import them before you start to access the desired file formats:

from pyexcel.ext import extension_name

or:

import pyexcel.ext.extension_name
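
Since a format only becomes available once its plugin module has been imported, a script may want to check for the module before using it. The following is a minimal sketch of such a check. The helper plugin_available is hypothetical and not part of pyexcel, and the module names simply follow the pyexcel.ext.extension_name pattern given above.

```python
import importlib


def plugin_available(module_name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Module paths follow the pyexcel.ext.<name> pattern described above.
for ext in ("xls", "xlsx"):
    module = "pyexcel.ext." + ext
    if not plugin_available(module):
        print("%s is not importable; install the matching plugin first" % module)
```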
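
Plugin compatibility table

+---------+------------+-------------+--------------+-------------+--------------+--------------+
| pyexcel | pyexcel-io | pyexcel-xls | pyexcel-xlsx | pyexcel-ods | pyexcel-ods3 | pyexcel-text |
+=========+============+=============+==============+=============+==============+==============+
| v0.1.1  | 0.0.2      | 0.0.3       | 0.0.2        | 0.0.4       | 0.0.5        | 0.0.2        |
+---------+------------+-------------+--------------+-------------+--------------+--------------+
| v0.0.10 | 0.0.2      | 0.0.3       | 0.0.2        | 0.0.4       | 0.0.5        | 0.0.2        |
+---------+------------+-------------+--------------+-------------+--------------+--------------+
| v0.0.9  | 0.0.1      | 0.0.2       | 0.0.1        | 0.0.3       | 0.0.4        | 0.0.2        |
+---------+------------+-------------+--------------+-------------+--------------+--------------+
| v0.0.8  | n/a        | 0.0.1       | n/a          | 0.0.2       | 0.0.2+       | 0.0.1        |
+---------+------------+-------------+--------------+-------------+--------------+--------------+
| v0.0.7  |            | n/a         |              | 0.0.2       | 0.0.2        | n/a          |
+---------+------------+-------------+--------------+-------------+--------------+--------------+
| v0.0.6  |            |             |              | 0.0.2       | 0.0.2        |              |
+---------+------------+-------------+--------------+-------------+--------------+--------------+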

Footnotes

[1] zipped csv file
[2] zipped tsv file