I am trying to get Llama 3 Instruct to use function calling with tools. It works, but now it answers with a function call to everything! If I ask something like "who are you?"
or "what is an Apple device?",
it still answers with a function call. Is this something in the chat template, or is something still missing in my code?
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
import os
import torch
from huggingface_hub import login

def get_current_temperature(location: str, unit: str) -> float:
    """
    Get the current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, Country"
        unit: The unit to return the temperature in. (choices: ["celsius", "fahrenheit"])
    Returns:
        The current temperature at the specified location in the specified units, as a float.
    """
    return 22.  # A real function should probably actually get the temperature!

def get_current_wind_speed(location: str) -> float:
    """
    Get the current wind speed in km/h at a given location.

    Args:
        location: The location to get the wind speed for, in the format "City, Country"
    Returns:
        The current wind speed at the given location in km/h, as a float.
    """
    return 6.  # A real function should probably actually get the wind speed!

tools = [get_current_temperature, get_current_wind_speed]

# Suppress MPS log message (optional)
os.environ["TORCH_MPS_DEVICE"] = "1"

checkpoint = "models/Llama-3.2-1B-Instruct"
messages = [
    {"role": "user", "content": "Hey, who are you ?"}
]

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, device_map="cpu")

inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):], skip_special_tokens=True))
1 Answer
Just based on my experience using LangChain/LangGraph as an AI orchestrator: from my understanding, the LLM doesn't actually invoke the tool, it just replies with the tool's function name and arguments. You'll need the AI orchestrator (i.e. your own code) to actually make the call.
Ref: https://python.langchain.com/docs/how_to/tool_calling/
Remember, while the name "tool calling" implies that the model is directly performing some action, this is actually not the case! The model only generates the arguments to a tool, and actually running the tool (or not) is up to the user.
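To make that concrete, here is a minimal sketch of the loop your code has to run with plain transformers, continuing from the variables in your snippet: take the tool call the model emitted, execute the matching Python function yourself, append the result as a "tool" message, and generate again to get a natural-language answer. The hard-coded tool_call dict and its exact JSON shape are assumptions; in practice you would parse it from the generated text, and the precise format depends on the Llama 3 chat template.

# Suppose the model's output above was a tool call like this
# (in reality you would parse it out of the generated text):
tool_call = {"name": "get_current_temperature",
             "arguments": {"location": "Paris, France", "unit": "celsius"}}

# 1. Record the assistant's tool call in the conversation history.
messages.append({"role": "assistant",
                 "tool_calls": [{"type": "function", "function": tool_call}]})

# 2. Actually run the function yourself; the model never executes anything.
result = get_current_temperature(**tool_call["arguments"])

# 3. Feed the result back as a "tool" message and generate the final reply.
messages.append({"role": "tool", "name": tool_call["name"], "content": str(result)})

inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True,
                                       return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):], skip_special_tokens=True))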