hub / github.com/pathwaycom/pathway / write

Function write

python/pathway/io/fs/__init__.py:274–382

Writes ``table``'s stream of updates to a file in the given format.

Args:
    table: Table to be written.
    filename: Path to the target output file.
    format: Format to use for data output. Currently, there are two supported formats: ``"json"`` and ``"csv"``.

(
    table: Table,
    filename: str | PathLike,
    format: Literal["json", "csv"],
    *,
    name: str | None = None,
    sort_by: Iterable[ColumnReference] | None = None,
)
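The `sort_by` parameter in the signature above orders rows within each minibatch, comparing value tuples lexicographically when several columns are given (see the docstring in the source listing). The stdlib-only sketch below imitates that ordering on plain dictionaries. It does not call Pathway: the `sort_within_minibatches` helper and the sample rows are hypothetical, and it assumes minibatches are emitted in ascending `time` order.

```python
from itertools import groupby
from operator import itemgetter


def sort_within_minibatches(rows, sort_by):
    """Sort rows by the given columns, one minibatch at a time.

    Hypothetical helper that mimics the documented ordering; it is not
    Pathway's implementation.
    """
    by_time = itemgetter("time")
    ordered = []
    # Group rows by minibatch (assumed to be emitted in ascending time order).
    for _, batch in groupby(sorted(rows, key=by_time), key=by_time):
        # Python compares tuples lexicographically, matching the docstring's rule.
        ordered.extend(sorted(batch, key=lambda r: tuple(r[c] for c in sort_by)))
    return ordered


# Illustrative rows only; "Aaron" arrives in a later minibatch.
rows = [
    {"owner": "Bob", "age": 9, "time": 0},
    {"owner": "Alice", "age": 10, "time": 0},
    {"owner": "Alice", "age": 8, "time": 0},
    {"owner": "Aaron", "age": 7, "time": 2},
]
result = sort_within_minibatches(rows, ["owner", "age"])
for row in result:
    print(row["time"], row["owner"], row["age"])
```

Note that the row for "Aaron" stays last even though it sorts first alphabetically: the ordering applies inside each minibatch, not across the whole output.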

Source from the content-addressed store, hash-verified

@check_arg_types
@trace_user_frame
def write(
    table: Table,
    filename: str | PathLike,
    format: Literal["json", "csv"],
    *,
    name: str | None = None,
    sort_by: Iterable[ColumnReference] | None = None,
) -> None:
    """Writes ``table``'s stream of updates to a file in the given format.

    Args:
        table: Table to be written.
        filename: Path to the target output file.
        format: Format to use for data output. Currently, there are two supported
            formats: ``"json"`` and ``"csv"``.
        name: A unique name for the connector. If provided, this name will be used in
            logs and monitoring dashboards.
        sort_by: If specified, the output will be sorted in ascending order based on the
            values of the given columns within each minibatch. When multiple columns are provided,
            the corresponding value tuples will be compared lexicographically.

    Returns:
        None

    Example:

    In this simple example you can see how table output works.
    First, import Pathway Live Data Framework and create a table:

    >>> import pathway as pw
    >>> t = pw.debug.table_from_markdown("age owner pet \\n1 10 Alice dog \\n2 9 Bob cat \\n3 8 Alice cat")

    Consider you would want to output the stream of changes of this table in ``"csv"`` format.
    In order to do that you simply do:

    >>> pw.io.fs.write(t, "table.csv", format="csv")

    Now, let's see what you have on the output:

    .. code-block:: bash

        cat table.csv

    .. code-block:: csv

        age,owner,pet,time,diff
        10,"Alice","dog",0,1
        9,"Bob","cat",0,1
        8,"Alice","cat",0,1

    The first three columns clearly represent the data columns you have. The column ``time``
    represents the number of operations minibatch, in which each of the rows was read. In
    this example, since the data is static: you have ``0``. The ``diff`` is another
    element of this stream of updates. In this context, it is ``1`` because all three rows were read from
    the input. All in all, the extra information in ``time`` and ``diff`` columns - in this case -
    shows us that in the initial minibatch (``time = 0``), you have read three rows and all of
    them were added to the collection (``diff = 1``).
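The `time` and `diff` columns described in the docstring can be replayed to recover the table's contents at a given point. Here is a minimal stdlib-only sketch, assuming that a `diff` of `-1` marks the removal of a row (the excerpt above only shows `diff = 1`). The `replay` helper and the two rows at `time = 2` are hypothetical additions for illustration, not actual Pathway output.

```python
import csv
import io
from collections import Counter

# The first three data rows come from the docstring example; the last two
# (time = 2) are made up to show an update as a removal plus an insertion.
CSV_TEXT = '''age,owner,pet,time,diff
10,"Alice","dog",0,1
9,"Bob","cat",0,1
8,"Alice","cat",0,1
9,"Bob","cat",2,-1
9,"Bob","parrot",2,1
'''


def replay(csv_text, up_to_time=None):
    """Accumulate diffs into a multiset of (age, owner, pet) rows."""
    state = Counter()
    for record in csv.DictReader(io.StringIO(csv_text)):
        time = int(record.pop("time"))
        diff = int(record.pop("diff"))
        if up_to_time is not None and time > up_to_time:
            continue  # ignore updates from later minibatches
        state[tuple(record.values())] += diff
    # Unary plus drops rows whose net count is zero or negative.
    return +state


print(sorted(replay(CSV_TEXT, up_to_time=0)))  # state after the initial minibatch
print(sorted(replay(CSV_TEXT)))                # state after all updates
```

Values stay as strings here because `csv.DictReader` does no type conversion; a real consumer would cast `age` to an integer as needed.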

Callers 5

test_raw_mq_write_raises_no_column_selected · Function · 0.50

test_raw_mq_raises_wrong_type · Function · 0.50

test_message_queue_topic_name_error · Function · 0.50

check_write_quotes_table_name_with_special_characters · Function · 0.50

check_write_quotes_reserved_word_column_name · Function · 0.50

Calls 4

_format_output_value_fields · Function · 0.90

format · Method · 0.80

to · Method · 0.80

join · Method · 0.45

Tested by 3

test_raw_mq_write_raises_no_column_selected · Function · 0.40

test_raw_mq_raises_wrong_type · Function · 0.40

test_message_queue_topic_name_error · Function · 0.40