hub / github.com/dask/dask / read_sql_table

Function read_sql_table

dask/dataframe/io/sql.py:195–350

Read SQL database table into a DataFrame. If neither ``divisions`` nor ``npartitions`` is given, the memory footprint of the first few rows will be determined, and partitions of size ~256MB will be used.

(
    table_name,
    con,
    index_col,
    divisions=None,
    npartitions=None,
    limits=None,
    columns=None,
    bytes_per_chunk="256 MiB",
    head_rows=5,
    schema=None,
    meta=None,
    engine_kwargs=None,
    **kwargs,
)
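The default sizing described in the summary (sample `head_rows` rows, then aim for `bytes_per_chunk` per partition) comes down to simple arithmetic. The sketch below illustrates that arithmetic only; it is not dask's implementation. `parse_size` and `estimate_npartitions` are hypothetical helper names, and the rounding rule is an assumption.

```python
def parse_size(size):
    """Hypothetical helper: convert an int or a string like "256 MiB" to bytes."""
    if isinstance(size, int):
        return size
    units = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30}
    number, unit = size.split()
    return int(float(number) * units[unit])


def estimate_npartitions(sample_bytes, sample_rows, total_rows,
                         bytes_per_chunk="256 MiB"):
    """Estimate a partition count from a small sample of rows.

    sample_bytes: in-memory size of the sampled rows
    sample_rows:  number of rows sampled (cf. ``head_rows``)
    total_rows:   row count of the whole table
    """
    bytes_per_row = sample_bytes / sample_rows
    estimate = total_rows * bytes_per_row / parse_size(bytes_per_chunk)
    # Assumption: round to the nearest integer, but always keep one partition.
    return max(1, round(estimate))


# 5 sampled rows occupying 500 bytes, in a table of 10 million rows
print(estimate_npartitions(500, 5, 10_000_000))
```

Because the estimate scales with the sampled row size, a sample that is unrepresentative of the table (for example, unusually short strings in the first rows) would skew the partition count.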

Source from the content-addressed store, hash-verified

def read_sql_table(
    table_name,
    con,
    index_col,
    divisions=None,
    npartitions=None,
    limits=None,
    columns=None,
    bytes_per_chunk="256 MiB",
    head_rows=5,
    schema=None,
    meta=None,
    engine_kwargs=None,
    **kwargs,
):
    """
    Read SQL database table into a DataFrame.

    If neither ``divisions`` nor ``npartitions`` is given, the memory footprint of the
    first few rows will be determined, and partitions of size ~256MB will
    be used.

    Parameters
    ----------
    table_name : str
        Name of SQL table in database.
    con : str
        Full sqlalchemy URI for the database connection
    index_col : str
        Column which becomes the index, and defines the partitioning. Should
        be an indexed column in the SQL server, and any orderable type. If the
        type is number or time, then partition boundaries can be inferred from
        ``npartitions`` or ``bytes_per_chunk``; otherwise must supply explicit
        ``divisions``.
    columns : sequence of str or SqlAlchemy column or None
        Which columns to select; if None, gets all. Note that this can be a mix of str and SqlAlchemy columns
    schema : str or None
        Pass this to sqlalchemy to select which DB schema to use within the
        URI connection
    divisions : sequence
        Values of the index column to split the table by. If given, this will
        override ``npartitions`` and ``bytes_per_chunk``. The divisions are the value
        boundaries of the index column used to define the partitions. For
        example, ``divisions=list('acegikmoqsuwz')`` could be used to partition
        a string column lexicographically into 12 partitions, with the implicit
        assumption that each partition contains similar numbers of records.
    npartitions : int
        Number of partitions, if ``divisions`` is not given. Will split the values
        of the index column linearly between ``limits``, if given, or the column
        max/min. The index column must be numeric or time for this to work.
    limits : 2-tuple or None
        Manually give upper and lower range of values for use with ``npartitions``;
        if None, first fetches max/min from the DB. Upper limit, if
        given, is inclusive.
    bytes_per_chunk : str or int
        If both ``divisions`` and ``npartitions`` are None, this is the target size of
        each partition, in bytes
    head_rows : int
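The docstring describes two ways to obtain partition boundaries: explicit `divisions`, or a linear split of the index range into `npartitions` pieces. The sketch below shows how a list of boundaries maps to partitions and how a linear split between two limits could be computed. `linear_divisions` and `partition_ranges` are hypothetical names used for illustration only, not part of dask, and the equal-width spacing is an assumption based on the word "linearly" in the docstring.

```python
def linear_divisions(lower, upper, npartitions):
    """Split [lower, upper] into npartitions equal-width intervals.

    Returns npartitions + 1 boundary values; the final boundary is the
    upper limit itself.
    """
    step = (upper - lower) / npartitions
    return [lower + i * step for i in range(npartitions)] + [upper]


def partition_ranges(divisions):
    """Pair consecutive boundaries: N + 1 divisions describe N partitions."""
    return list(zip(divisions[:-1], divisions[1:]))


# Numeric index: four partitions between limits 0 and 100
print(linear_divisions(0, 100, 4))

# String index: the 13 boundaries from the docstring example give 12 partitions
letters = list("acegikmoqsuwz")
print(len(partition_ranges(letters)))
```

For a string index there is no meaningful linear split, which is why the docstring requires explicit `divisions` unless the index column is numeric or a time type.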

Callers 15

test_shuffle_after_read_sql · Function · 0.90
test_empty · Function · 0.90
test_single_column · Function · 0.90
test_empty_other_schema · Function · 0.90
test_needs_rational · Function · 0.90
test_simple · Function · 0.90
test_npartitions · Function · 0.90
test_divisions · Function · 0.90
test_division_or_partition · Function · 0.90
test_meta · Function · 0.90
test_meta_no_head_rows · Function · 0.90
test_no_meta_no_head_rows · Function · 0.90

Calls 2

read_sql_query · Function · 0.85
pop · Method · 0.80

Tested by 15

test_shuffle_after_read_sql · Function · 0.72
test_empty · Function · 0.72
test_single_column · Function · 0.72
test_empty_other_schema · Function · 0.72
test_needs_rational · Function · 0.72
test_simple · Function · 0.72
test_npartitions · Function · 0.72
test_divisions · Function · 0.72
test_division_or_partition · Function · 0.72
test_meta · Function · 0.72
test_meta_no_head_rows · Function · 0.72
test_no_meta_no_head_rows · Function · 0.72
