I have posted the initial question here
I am very new to LangChain and trying to understand how to properly work with prompts to get good, brief answers from small local models from Hugging Face.
I am trying to use "meta-llama/Llama-3.2-1B" locally with very simple prompts and a simple question like: what is the capital of France?
Here is my prompt
template = PromptTemplate(
    input_variables=["question"],
    # template="You are a helpful assistant. Answer the following question: {question} \nAnswer:"
    template="You are a helpful assistant. Answer the following question: {question} "
)
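To make the comparison concrete, this is the exact string the model receives after rendering the template (a small check using PromptTemplate.format; the question text is just my example):

rendered = template.format(question="What is the capital of France?")
print(rendered)
# You are a helpful assistant. Answer the following question: What is the capital of France?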
The following:
template="You are a helpful assistant. Answer the following question: {question} \nAnswer:"
gives the answer:
You are a helpful assistant. Answer the following question: What is the capital of France?
Answer: Paris
Explanation: The capital of France is Paris. The capital of France is Paris. The capital of France is Paris. The capital of France is Paris. The capital of France is Paris. The capital of France is Paris. The capital of France is Paris. The capital of France is Paris. The capital of France
If I use:
template="You are a helpful assistant. Answer the following question: {question} "
it gives the answer:
You are a helpful assistant. Answer the following question: What is the capital of France? 1. Paris 2. Lyon 3. Marseille 4. Toulouse 5. Nice 6. Strasbourg 7. Bordeaux 8. Nantes 9. Lille 10. Montpellier 11. Rennes 12. Toulon 13. Tours 14. Angers
I am not sure how such a small nuance as \nAnswer: in the template can have such an impact.
I have tried to use the same model with Ollama and it works as expected.
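For comparison, this is roughly the Ollama setup I used (a minimal sketch from memory; I am assuming the langchain_ollama package and the llama3.2:1b model tag here):

from langchain_ollama import OllamaLLM
from langchain.prompts import PromptTemplate

llm = OllamaLLM(model="llama3.2:1b", temperature=0.2)  # assumed model tag
template = PromptTemplate(
    input_variables=["question"],
    template="You are a helpful assistant. Answer the following question: {question}",
)
chain = template | llm
print(chain.invoke({"question": "What is the capital of France?"}))
# Here the answer comes back short and direct, without the prompt echoed around it.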
The first answer is better, but it is still not as brief and precise as I would like, and there is a lot of redundant context around it.
Full code:
from transformers import pipeline
from langchain_huggingface import HuggingFacePipeline
from langchain.prompts import PromptTemplate

# Text-generation pipeline running the model locally on CPU
model = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B",
    max_new_tokens=64,
    temperature=0.2,
    device="cpu",
    truncation=True,
)
llm = HuggingFacePipeline(pipeline=model)

template = PromptTemplate(
    input_variables=["question"],
    # template="You are a helpful assistant. Answer the following question: {question} \nAnswer:"
    template="You are a helpful assistant. Answer the following question: {question} "
)

# Compose the prompt template and the model into a chain
chain = template | llm

question = input("\nEnter question:\n")

# Run the chain on the question
summary = chain.invoke({"question": question})
print("\n", summary)