Writes ``table``'s stream of updates to a file in the given format.

Args:
    table: Table to be written.
    filename: Path to the target output file.
    format: Format to use for data output. Currently, there are two supported formats: ``"json"`` and ``"csv"``.
(
    table: Table,
    filename: str | PathLike,
    format: Literal["json", "csv"],
    *,
    name: str | None = None,
    sort_by: Iterable[ColumnReference] | None = None,
)
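The "stream of updates" that this function writes carries a ``time`` and a ``diff`` column next to the data columns, and ``sort_by`` orders rows within each minibatch by comparing value tuples lexicographically. The following is a minimal sketch in plain Python (standard library only, no Pathway calls) of how a consumer might read such a CSV file. The helper names ``read_updates``, ``replay_updates`` and ``sort_within_minibatch`` are hypothetical and not part of the library, and the treatment of ``diff = -1`` as a retraction is an assumption: the sample output in the docstring only shows ``diff = 1``.

```python
import csv
import io
from collections import Counter

# Sample output copied from the docstring example (header plus three rows).
SAMPLE_CSV = '''age,owner,pet,time,diff
10,"Alice","dog",0,1
9,"Bob","cat",0,1
8,"Alice","cat",0,1
'''


def read_updates(text):
    """Parse CSV text into a list of dicts; ``time`` and ``diff`` become ints.

    Data columns stay as strings, because CSV carries no type information.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["time"] = int(row["time"])
        row["diff"] = int(row["diff"])
        rows.append(row)
    return rows


def replay_updates(updates):
    """Hypothetical helper: rebuild the current rows from a list of updates.

    Assumption: ``diff = 1`` adds a row and ``diff = -1`` retracts it.
    """
    state = Counter()
    for update in sorted(updates, key=lambda u: u["time"]):
        # Identify a row by its data columns only (everything but time/diff).
        data = tuple((k, v) for k, v in update.items() if k not in ("time", "diff"))
        state[data] += update["diff"]
    # Keep only rows whose net count is positive.
    return [dict(data) for data, count in state.items() if count > 0]


def sort_within_minibatch(updates, columns):
    """Hypothetical helper illustrating the documented ``sort_by`` ordering.

    Rows are grouped by minibatch (``time``) and, inside each minibatch,
    ordered by lexicographic comparison of the selected column values.
    """
    return sorted(updates, key=lambda u: (u["time"], tuple(u[c] for c in columns)))


if __name__ == "__main__":
    updates = read_updates(SAMPLE_CSV)
    print(replay_updates(updates))
    print(sort_within_minibatch(updates, ["owner", "pet"]))
```

Keeping the net count per row in a ``Counter`` means a later retraction cancels an earlier insertion without tracking row order, which is why the replay sorts by ``time`` only.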
@check_arg_types
@trace_user_frame
def write(
    table: Table,
    filename: str | PathLike,
    format: Literal["json", "csv"],
    *,
    name: str | None = None,
    sort_by: Iterable[ColumnReference] | None = None,
) -> None:
    """Writes ``table``'s stream of updates to a file in the given format.

    Args:
        table: Table to be written.
        filename: Path to the target output file.
        format: Format to use for data output. Currently, there are two supported
            formats: ``"json"`` and ``"csv"``.
        name: A unique name for the connector. If provided, this name will be used in
            logs and monitoring dashboards.
        sort_by: If specified, the output will be sorted in ascending order based on the
            values of the given columns within each minibatch. When multiple columns are
            provided, the corresponding value tuples will be compared lexicographically.

    Returns:
        None

    Example:

    This simple example shows how table output works.
    First, import Pathway Live Data Framework and create a table:

    >>> import pathway as pw
    >>> t = pw.debug.table_from_markdown("age owner pet \\n1 10 Alice dog \\n2 9 Bob cat \\n3 8 Alice cat")

    Suppose you want to output this table's stream of changes in ``"csv"`` format.
    To do that, call:

    >>> pw.io.fs.write(t, "table.csv", format="csv")

    Now, let's see what the output contains:

    .. code-block:: bash

        cat table.csv

    .. code-block:: csv

        age,owner,pet,time,diff
        10,"Alice","dog",0,1
        9,"Bob","cat",0,1
        8,"Alice","cat",0,1

    The first three columns are the table's data columns. The ``time`` column
    identifies the minibatch of operations in which each row was read. In this
    example the data is static, so it is ``0``. The ``diff`` column is the other
    element of this stream of updates. Here it is ``1`` because all three rows were
    read from the input. Taken together, the ``time`` and ``diff`` columns show that
    in the initial minibatch (``time = 0``) three rows were read and all of them
    were added to the collection (``diff = 1``).
