Neuronx for mistralai/Mistral-7B-Instruct-v0.2 - Updated Mistral 7B Model on AWS Inferentia2

This model has been exported to the Neuron format using the input_shapes and compiler parameters detailed in the sections below.

Please refer to the 🤗 optimum-neuron documentation for an explanation of these parameters.

Note: To compile mistralai/Mistral-7B-Instruct-v0.2 on Inf2, you need to update the sliding_window field of the model config (either in the config.json file or on the loaded config object) from null to the default value of 4096.
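The note above can be scripted. This is a minimal sketch that patches a local copy of the model's config.json with the standard library; the path and the helper name are placeholders of this example, not part of optimum-neuron.

```python
import json

def patch_sliding_window(path, value=4096):
    """Set sliding_window in a Mistral config.json when it is null."""
    with open(path) as f:
        config = json.load(f)
    if config.get("sliding_window") is None:
        config["sliding_window"] = value
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
    return config

# patch_sliding_window("path/to/config.json")  # run before exporting to Neuron
```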

Usage with 🤗 optimum-neuron

>>> from optimum.neuron import pipeline

>>> p = pipeline('text-generation', 'davidshtian/Mistral-7B-Instruct-v0.2-neuron-1x2048-2-cores')
>>> p("My favorite place on earth is", max_new_tokens=64, do_sample=True, top_k=50)
[{'generated_text': 'My favorite place on earth is Hawaii,” she said, her voice bright and clear despite her quietness.
“That place, and the ocean. But it’s hard to ever live there permanently. The ocean is there and it calls to me, but
it’s big and vast and doesn’t allow me a lot of freedom.”'}]

This repository contains tags specific to versions of neuronx. When using it with 🤗 optimum-neuron, select the repo revision matching your neuronx version so that the right serialized checkpoints are loaded.

Arguments passed during export

input_shapes

{
  "batch_size": 1,
  "sequence_length": 2048
}

compiler_args

{
  "auto_cast_type": "bf16",
  "num_cores": 2
}
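The two argument sets above map onto optimum-neuron's export API. The sketch below shows an assumed invocation that could reproduce a checkpoint with these shapes and compiler settings; it requires an Inf2 instance with the Neuron SDK and optimum-neuron installed, and the output directory name is a placeholder.

```python
# Assumed export call; needs optimum-neuron on an Inf2 instance to actually run.
def export_to_neuron(output_dir="mistral-7b-instruct-neuron"):
    from optimum.neuron import NeuronModelForCausalLM

    model = NeuronModelForCausalLM.from_pretrained(
        "mistralai/Mistral-7B-Instruct-v0.2",
        export=True,
        batch_size=1,           # input_shapes
        sequence_length=2048,   # input_shapes
        num_cores=2,            # compiler_args
        auto_cast_type="bf16",  # compiler_args
    )
    model.save_pretrained(output_dir)
```

Remember to apply the sliding_window fix from the note above before exporting, or the compilation will fail.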