I am trying to get Llama 3 Instruct to use function calling with tools. It works, but now it answers with a function call to everything! If I ask something like "who are you?"
or "what is an Apple device?",
it still answers with a function call. Is this something in the chat template, or is something still missing in my code?
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
import os
import torch
from huggingface_hub import login

def get_current_temperature(location: str, unit: str) -> float:
    """
    Get the current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, Country"
        unit: The unit to return the temperature in. (choices: ["celsius", "fahrenheit"])
    Returns:
        The current temperature at the specified location in the specified units, as a float.
    """
    return 22.  # A real function should probably actually get the temperature!

def get_current_wind_speed(location: str) -> float:
    """
    Get the current wind speed in km/h at a given location.

    Args:
        location: The location to get the wind speed for, in the format "City, Country"
    Returns:
        The current wind speed at the given location in km/h, as a float.
    """
    return 6.  # A real function should probably actually get the wind speed!

tools = [get_current_temperature, get_current_wind_speed]

# Suppress MPS log message (optional)
os.environ["TORCH_MPS_DEVICE"] = "1"

checkpoint = "models/Llama-3.2-1B-Instruct"
messages = [
    {"role": "user", "content": "Hey, who are you ?"}
]

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, device_map="cpu")

inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):], skip_special_tokens=True))
1 Answer
Just based on my experience using LangChain/LangGraph as an AI orchestrator: from my understanding, the LLM doesn't actually invoke the tool, it just replies with the tool's function name and arguments. You'll need the AI orchestrator (i.e. your own code) to actually make the call.
Ref: https://python.langchain.com/docs/how_to/tool_calling/
Remember, while the name "tool calling" implies that the model is directly performing some action, this is actually not the case! The model only generates the arguments to a tool, and actually running the tool (or not) is up to the user.
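To make that concrete, here is a minimal sketch of the loop your code has to run with plain transformers, continuing from the variables in your snippet: take the tool call the model emitted, execute the matching Python function yourself, append the result as a "tool" message, and generate again to get a natural-language answer. The hard-coded tool_call dict and its exact JSON shape are assumptions; in practice you would parse it from the generated text, and the precise format depends on the Llama 3 chat template.

# Suppose the model's output above was a tool call like this
# (in reality you would parse it out of the generated text):
tool_call = {"name": "get_current_temperature",
             "arguments": {"location": "Paris, France", "unit": "celsius"}}

# 1. Record the assistant's tool call in the conversation history.
messages.append({"role": "assistant",
                 "tool_calls": [{"type": "function", "function": tool_call}]})

# 2. Actually run the function yourself; the model never executes anything.
result = get_current_temperature(**tool_call["arguments"])

# 3. Feed the result back as a "tool" message and generate the final reply.
messages.append({"role": "tool", "name": tool_call["name"], "content": str(result)})

inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True,
                                       return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):], skip_special_tokens=True))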