pyexcel.iget_array

pyexcel.iget_array(**keywords)

Obtain a generator of a two-dimensional array from an excel source

It is similar to pyexcel.get_array() but has a smaller memory footprint.
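The memory saving comes from handing rows over one at a time instead of building the whole array first. The sketch below illustrates that difference with the standard library only; it does not call pyexcel, and `iter_rows` is a hypothetical helper, not part of the library.

```python
import csv
import io


def iter_rows(text):
    """Hypothetical helper: yield one row at a time from CSV text."""
    for row in csv.reader(io.StringIO(text)):
        yield row


rows = iter_rows("a,b\n1,2\n3,4\n")

first = next(rows)     # only the first row has been parsed so far
rest = list(rows)      # consume the remaining rows
leftover = list(rows)  # a generator can be consumed only once
```

One consequence is that a generator result can be iterated only once. Wrapping it in list() restores random access, at the cost of the memory saving.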

Not all parameters are needed. The table below lists the parameters that apply to each source.

source	parameters
loading from file	file_name, sheet_name, keywords
loading from string	file_content, file_type, sheet_name, keywords
loading from stream	file_stream, file_type, sheet_name, keywords
loading from sql	session, table
loading from sql in django	model
loading from query sets	any query sets (sqlalchemy or django)
loading from dictionary	adict, with_keys
loading from records	records
loading from array	array
loading from a url	url

Parameters

file_name :

a file with supported file extension

file_content :

the file content

file_stream :

the file stream

file_type :

the file type in file_content or file_stream

session :

database session

table :

database table

model:

a django model

adict:

a dictionary of one dimensional arrays

url :

a download http url for your excel file

with_keys :

load with previous dictionary’s keys, default is True

records :

a list of dictionaries that have the same keys

array :

a two dimensional array, a list of lists

sheet_name :

sheet name. If sheet_name is not given, the default sheet at index 0 is loaded

start_row : int

defaults to 0. It allows you to skip rows at the beginning

row_limit: int

defaults to -1, meaning till the end of the whole sheet. It allows you to skip the trailing rows.

start_column : int

defaults to 0. It allows you to skip columns on your left hand side

column_limit: int

defaults to -1, meaning till the end of the columns. It allows you to skip the trailing columns.

skip_row_func:

It allows you to write your own row skipping functions.

The function should return pyexcel_io.constants.SKIP_DATA to skip data, pyexcel_io.constants.TAKE_DATA to read data, or pyexcel_io.constants.STOP_ITERATION to exit the reading procedure.

skip_column_func:

It allows you to write your own column skipping functions.

The function should return pyexcel_io.constants.SKIP_DATA to skip data, pyexcel_io.constants.TAKE_DATA to read data, or pyexcel_io.constants.STOP_ITERATION to exit the reading procedure.

skip_empty_rows: bool

Defaults to False. Set it to True if the remaining empty rows are not needed; note that this changes the number of rows returned.

row_renderer:

You could choose to write a custom row renderer when the data is being read.

auto_detect_float :

defaults to True

auto_detect_int :

defaults to True

auto_detect_datetime :

defaults to True

ignore_infinity :

defaults to True

library :

choose a specific pyexcel-io plugin for reading

source_library :

choose a specific data source plugin for reading

parser_library :

choose a pyexcel parser plugin for reading

skip_hidden_sheets:

Defaults to True. Set it to False to read hidden sheets.
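To make the row-window and skip-function parameters above more concrete, here is a small model written with the standard library only. It is a sketch under assumptions, not pyexcel's implementation: the `Action` enum stands in for the three pyexcel_io.constants values, the `(index, start, limit)` call signature and the reading of row_limit as a count of rows after start_row are assumptions, and `window_filter`, `read_rows` and `every_other_row` are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    """Stand-ins for the pyexcel_io.constants values named above."""
    SKIP_DATA = "skip"
    TAKE_DATA = "take"
    STOP_ITERATION = "stop"


def window_filter(index, start, limit=-1):
    """Assumed default behaviour: take `limit` rows beginning at `start`."""
    if index < start:
        return Action.SKIP_DATA
    if limit > 0 and index >= start + limit:
        return Action.STOP_ITERATION
    return Action.TAKE_DATA


def read_rows(rows, skip_row_func=window_filter, start_row=0, row_limit=-1):
    """Hypothetical reader loop that obeys the three-way protocol."""
    for index, row in enumerate(rows):
        action = skip_row_func(index, start_row, row_limit)
        if action is Action.STOP_ITERATION:
            break
        if action is Action.TAKE_DATA:
            yield row


def every_other_row(index, start, limit=-1):
    """Custom filter: keep even-numbered rows only."""
    return Action.TAKE_DATA if index % 2 == 0 else Action.SKIP_DATA


data = [[0], [1], [2], [3], [4]]
windowed = list(read_rows(data, start_row=1, row_limit=2))
alternating = list(read_rows(data, skip_row_func=every_other_row))
```

In this model, `windowed` holds the two rows after the first one and `alternating` holds rows 0, 2 and 4. The same three-way decision would apply to columns through skip_column_func.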

Parameters related to csv file format

For csv, fmtparams are accepted

delimiter :

field separator

lineterminator :

line terminator

encoding:

csv specific. Specifies the encoding of the csv file, for example encoding='latin1'. In particular, encoding='utf-8-sig' adds a UTF-8 BOM header when used in a renderer, and parses a csv with a UTF-8 BOM header when used in a parser.

escapechar :

A one-character string used by the writer to escape the delimiter if quoting is set to QUOTE_NONE and the quotechar if doublequote is False.

quotechar :

A one-character string used to quote fields containing special characters, such as the delimiter or quotechar, or which contain new-line characters. It defaults to '"'.

quoting :

Controls when quotes should be generated by the writer and recognised by the reader. It can take on any of the QUOTE_* constants (see section Module Contents of the Python csv module documentation) and defaults to QUOTE_MINIMAL.

skipinitialspace :

When True, whitespace immediately following the delimiter is ignored. The default is False.

pep_0515_off :

When True in python version 3.6, PEP-0515 is turned on. The default is False
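The csv keywords above share their names with the formatting parameters of Python's csv module. The sketch below shows their effect using csv.reader directly rather than pyexcel; that pyexcel forwards each keyword unchanged is an assumption here. The second part shows what the 'utf-8-sig' codec does with the BOM header.

```python
import codecs
import csv
import io

# Semicolon-delimited text with single-quoted fields and a space
# after each delimiter.
text = "name; note\n'Smith; J'; 'said hi'\n"

rows = list(
    csv.reader(
        io.StringIO(text),
        delimiter=";",          # field separator
        quotechar="'",          # quotes protect the embedded delimiter
        skipinitialspace=True,  # ignore the space after each delimiter
    )
)

# 'utf-8-sig' writes a BOM when encoding and strips it when decoding.
raw = "a,b\n".encode("utf-8-sig")
has_bom = raw.startswith(codecs.BOM_UTF8)
decoded = raw.decode("utf-8-sig")
```

Without skipinitialspace=True, the space before the second quoted field would prevent the quote character from being recognised at the start of the field.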

Parameters related to xls file format:

Please note that the following parameters apply to pyexcel-xls. More details can be found in xlrd.open_workbook().

logfile:

An open file to which messages and diagnostics are written.

verbosity:

Increases the volume of trace material written to the logfile.

use_mmap:

Whether to use the mmap module is determined heuristically. Use this arg to override the result.

Current heuristic: mmap is used if it exists.

encoding_override:

Used to overcome missing or bad codepage information in older-version files.

formatting_info:

The default is False, which saves memory.

When True, formatting information will be read from the spreadsheet file. This provides all cells, including empty and blank cells. Formatting information is available for each cell.

ragged_rows:

The default of False means all rows are padded out with empty cells so that all rows have the same size as found in ncols.

True means that there are no empty cells at the ends of rows. This can result in substantial memory savings if rows are of widely varying sizes. See also the row_len() method.

When you use this function to work on physical files, it leaves its file handle open. When you have finished working with the data, call pyexcel.free_resources() to close the file handle(s).

For csv and csvz file formats, file handles are left open. For xls and ods file formats, the whole file is read into memory and closed afterwards. For xlsx, file handles are left open in Python 2.7 - 3.5 by pyexcel-xlsx (openpyxl). In other words, pyexcel-xls, pyexcel-ods and pyexcel-ods3 won’t leak file handles.
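A handle stays open because rows are produced lazily, so the file cannot be closed until the caller has finished iterating. The following standard-library sketch models that pattern; `LazyRows` is a hypothetical class, and its `close()` plays the role that the text above assigns to pyexcel.free_resources().

```python
import os
import tempfile


class LazyRows:
    """Hypothetical lazy reader that keeps its file handle open."""

    def __init__(self, path):
        self.handle = open(path, encoding="utf-8")

    def __iter__(self):
        for line in self.handle:
            yield line.rstrip("\n").split(",")

    def close(self):
        self.handle.close()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.csv")
    with open(path, "w", encoding="utf-8") as out:
        out.write("1,2\n3,4\n")

    reader = LazyRows(path)
    try:
        first = next(iter(reader))  # read one row, stop early
        open_while_reading = not reader.handle.closed
    finally:
        reader.close()  # always release the handle, even on error
    closed_after = reader.handle.closed
```

Putting the cleanup call in a finally clause means the handle is released even if processing the rows raises an exception.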