hub / github.com/llmware-ai/llmware / stream

Method stream

llmware/models.py:5009–5143 · view source on GitHub ↗

Executes stream generation inference on model. NOTE: operates differently than other stream methods in LLMWare - the method is not a generator, but rather the streaming update is provided through passing a streamer function to the OpenVINO backend - which will

(self, prompt, add_context=None, add_prompt_engineering=None, api_key=None,
                  inference_dict=None, streamer=None)

Source from the content-addressed store, hash-verified

5007	return output
5008
5009	def stream(self, prompt, add_context=None, add_prompt_engineering=None, api_key=None,
5010	inference_dict=None, streamer=None):
5011
5012	""" Executes stream generation inference on model.
5013
5014	NOTE: operates differently than other stream methods in LLMWare -
5015	the method is not a generator, but rather the streaming update is
5016	provided through passing a streamer function to the OpenVINO
5017	backend - which will be called at each step of the generation
5018	cycle.
5019
5020	Sample call:
5021
5022	# will automatically use default streamer to print to console
5023	response = model.stream('Where is Paris?')
5024
5025	# pass a custom streaming function
5026	response = model.stream('Where is Rome?', streamer=my_streamer)
5027
5028	Streamer function example: .ov_default_streamer in this model class
5029
5030	"""
5031
5032	# first prepare the prompt
5033	self.prompt = prompt
5034
5035	if add_context:
5036	self.add_context = add_context
5037
5038	self.context = self.add_context
5039
5040	if add_prompt_engineering:
5041	self.add_prompt_engineering = add_prompt_engineering
5042
5043	# add defaults if add_prompt_engineering not set
5044	if not self.add_prompt_engineering:
5045
5046	if self.add_context:
5047	self.add_prompt_engineering = "default_with_context"
5048	else:
5049	self.add_prompt_engineering = "default_no_context"
5050
5051	# end - defaults update
5052
5053	# show warning if function calling model
5054	if self.fc_supported:
5055	logger.warning("OVGenerativeModel - this is a function calling model - using .inference may lead "
5056	"to unexpected results. Recommended to use the .function_call method to ensure "
5057	"correct prompt template packaging.")
5058
5059	if inference_dict:
5060
5061	if "temperature" in inference_dict:
5062	self.temperature = inference_dict["temperature"]
5063
5064	if "max_tokens" in inference_dict:
5065	self.target_requested_output_tokens = inference_dict["max_tokens"]
5066

Callers

nothing calls this directly

Calls 6

inference_over_api_endpointMethod · 0.95

prompt_engineerMethod · 0.95

ov_token_counterMethod · 0.95

_generate_ov_genaiMethod · 0.95

previewMethod · 0.80

registerMethod · 0.80

Tested by

no test coverage detected