hub / github.com/huggingface/datasets / pdf_to_bytes

Function pdf_to_bytes

src/datasets/features/pdf.py:22–27

Convert a pdfplumber.pdf.PDF object to bytes.

(pdf: "pdfplumber.pdf.PDF")

Source from the content-addressed store, hash-verified

def pdf_to_bytes(pdf: "pdfplumber.pdf.PDF") -> bytes:
    """Convert a pdfplumber.pdf.PDF object to bytes."""
    with BytesIO() as buffer:
        for page in pdf.pages:
            buffer.write(page.pdf.stream)
        return buffer.getvalue()
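The buffer-accumulation pattern in this function can be exercised without pdfplumber installed. The sketch below is illustrative only: `make_fake_pdf` is a hypothetical helper, and the stand-in objects assume that `page.pdf` refers back to the parent document and that its `stream` attribute holds raw bytes, which may not match what pdfplumber actually exposes. The function body is repeated (without the type annotation) so the snippet runs on its own.

```python
from io import BytesIO
from types import SimpleNamespace


def pdf_to_bytes(pdf) -> bytes:
    """Same body as the function shown above, repeated so the sketch is standalone."""
    with BytesIO() as buffer:
        for page in pdf.pages:
            # Each page contributes whatever its parent document's stream holds.
            buffer.write(page.pdf.stream)
        # Read the buffer before the `with` block closes it.
        return buffer.getvalue()


def make_fake_pdf(raw: bytes, n_pages: int) -> SimpleNamespace:
    """Hypothetical stand-in for a pdfplumber PDF (an assumption, not the real class).

    Every page points back to its parent through `.pdf`, and the parent's
    `.stream` is assumed to be raw bytes.
    """
    pdf = SimpleNamespace(stream=raw, pages=[])
    pdf.pages = [SimpleNamespace(pdf=pdf) for _ in range(n_pages)]
    return pdf


fake = make_fake_pdf(b"%PDF-stub", n_pages=3)
result = pdf_to_bytes(fake)
```

Under these assumptions the loop writes the parent stream once per page, so `result` here is the stub bytes repeated three times. Whether that is the intended behavior for real multi-page documents is worth checking against pdfplumber itself.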

Callers 1

encode_pdfplumber_pdf · Function · 0.85

Calls 1

write · Method · 0.45

Tested by

no test coverage detected