Committing too many files but the app runs possibly with a new model.

VectorSearchApp/Models/CUSTOM_MODEL_README.md (new file, 112 lines)
# Custom Model Conversion Guide

This document describes how to use a custom embedding model for address embeddings.

## Model Source

The custom model is available at:

- **HuggingFace**: [jarredparrett/all-MiniLM-L6-v2_tuned_on_deepparse_address_mutations_comb_3](https://huggingface.co/jarredparrett/all-MiniLM-L6-v2_tuned_on_deepparse_address_mutations_comb_3)

This model is a fine-tuned version of all-MiniLM-L6-v2, trained specifically on address data.
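For context on how such embeddings are typically used: this model produces 384-dimensional vectors, and address similarity is usually scored with cosine similarity. A minimal pure-Python sketch (illustrative only, not the application's actual search code):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors; real embeddings from this model have 384 dimensions.
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 0.0, 1.0]
v3 = [0.0, 1.0, 0.0]
print(cosine_similarity(v1, v2))  # identical vectors -> 1.0
print(cosine_similarity(v1, v3))  # orthogonal vectors -> 0.0
```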
## Converting to ONNX Format

Since the model is not published with a pre-converted ONNX version, you need to convert it yourself using Python.

### Prerequisites

Install the required packages (the extras specifier is quoted so it also works in shells like zsh):

```bash
pip install "optimum[exporters]" transformers torch
```
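Before running the conversion, you can confirm the packages are importable. A small helper (not part of the repository, shown only as a convenience check):

```python
from importlib.util import find_spec

def missing_packages(names: list[str]) -> list[str]:
    """Return the subset of package names that cannot be imported."""
    return [n for n in names if find_spec(n) is None]

# The conversion needs optimum, transformers, and torch.
print(missing_packages(["optimum", "transformers", "torch"]))  # [] when everything is installed
```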
### Conversion Steps

1. **Run the conversion script**:

   ```bash
   cd VectorSearchApp/Models
   python download-convert-model.py
   ```

   This will:

   - Download the model from HuggingFace
   - Convert it to ONNX format using Optimum
   - Save the converted model to `Models/custom-model/`
   - Copy the main model file to `Models/address-embedding-model.onnx`
2. **Update configuration**:

   Edit `VectorSearchApp/appsettings.json`:

   ```json
   {
     "Embedding": {
       "ModelName": "jarredparrett/all-MiniLM-L6-v2_tuned_on_deepparse_address_mutations_comb_3",
       "Dimension": 384,
       "ApiToken": "",
       "UseLocalInference": true
     }
   }
   ```

   Or use the shorter alias:

   ```json
   {
     "Embedding": {
       "ModelName": "custom-all-MiniLM-L6-v2-address",
       "Dimension": 384,
       "ApiToken": "",
       "UseLocalInference": true
     }
   }
   ```
3. **Run the application**:

   ```bash
   cd VectorSearchApp
   dotnet run
   ```
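The `Embedding` section edited in step 2 can be sanity-checked before launching the app. A short sketch (the key names mirror the JSON above; the validation helper itself is hypothetical and not part of the repository):

```python
import json

REQUIRED_KEYS = {"ModelName", "Dimension", "ApiToken", "UseLocalInference"}

def validate_embedding_config(raw: str) -> dict:
    """Parse appsettings-style JSON and check the Embedding section."""
    settings = json.loads(raw)
    embedding = settings["Embedding"]
    missing = REQUIRED_KEYS - embedding.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if embedding["Dimension"] != 384:
        raise ValueError("this model produces 384-dimensional embeddings")
    return embedding

sample = """
{
  "Embedding": {
    "ModelName": "custom-all-MiniLM-L6-v2-address",
    "Dimension": 384,
    "ApiToken": "",
    "UseLocalInference": true
  }
}
"""
cfg = validate_embedding_config(sample)
print(cfg["ModelName"])  # custom-all-MiniLM-L6-v2-address
```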
## Output Files

After conversion, the following files will be created:

```
VectorSearchApp/Models/
├── custom-model/
│   ├── config.json
│   ├── model.onnx
│   ├── special_tokens_map.json
│   ├── tokenizer.json
│   ├── tokenizer_config.json
│   └── vocab.txt
└── address-embedding-model.onnx (copy of model.onnx for easy access)
```
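After conversion you can check that everything in the tree above was actually produced. A small helper using `pathlib` (a convenience sketch, not part of the repository):

```python
from pathlib import Path

EXPECTED = [
    "custom-model/config.json",
    "custom-model/model.onnx",
    "custom-model/special_tokens_map.json",
    "custom-model/tokenizer.json",
    "custom-model/tokenizer_config.json",
    "custom-model/vocab.txt",
    "address-embedding-model.onnx",
]

def missing_outputs(models_dir: str) -> list[str]:
    """Return expected conversion outputs that do not exist under models_dir."""
    root = Path(models_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]

# After a successful conversion this should print [].
print(missing_outputs("VectorSearchApp/Models"))
```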
## Troubleshooting

### CUDA/GPU Support

If you want to use GPU acceleration during conversion:

```python
from optimum.onnxruntime import ORTModelForFeatureExtraction

model_id = "jarredparrett/all-MiniLM-L6-v2_tuned_on_deepparse_address_mutations_comb_3"

model = ORTModelForFeatureExtraction.from_pretrained(
    model_id,
    export=True,
    provider="CUDAExecutionProvider",  # use CUDA instead of the default CPU provider
)
```
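If CUDA is not available on the machine, conversion still works: the CPU provider is simply ONNX Runtime's default. A hypothetical helper (not part of the repository) that picks the best execution provider from the available list:

```python
# Preference order: GPU first, CPU as the universal fallback.
PREFERRED = ["CUDAExecutionProvider", "CPUExecutionProvider"]

def pick_provider(available: list[str]) -> str:
    """Return the most preferred execution provider that is available."""
    for provider in PREFERRED:
        if provider in available:
            return provider
    raise RuntimeError("no supported execution provider available")

# With onnxruntime installed, `available` would come from
# onnxruntime.get_available_providers().
print(pick_provider(["CPUExecutionProvider"]))  # CPUExecutionProvider
```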
### Large Model Download

The first conversion may take several minutes because it downloads the full model (~90 MB) and tokenizer files from HuggingFace.

### Memory Requirements

Conversion requires approximately 4 GB of RAM. If you run into memory issues, close other applications before running the script.