yolotriton

Version: v0.5.0
Published: Aug 23, 2025 License: MIT Imports: 24 Imported by: 0

README

Go (Golang) gRPC client for YOLO-NAS and YOLO inference using the Triton Inference Server.

Installation

Use go get to install this package:

go get github.com/dev6699/yolotriton

Get a YOLO-NAS or YOLO TensorRT model

Export of quantized YOLO model

Install ultralytics

pip install ultralytics

NOTE: Replace yolo12n.pt with your target model

# Export ONNX format then use trtexec to convert
yolo export model=yolo12n.pt format=onnx
trtexec --onnx=yolo12n.onnx --saveEngine=model_repository/yolov12/1/model.plan

NOTE: Inputs and outputs remain FP32 for compatibility reasons.

# export FP32 TensorRT format directly
yolo export model=yolo12n.pt format=engine

# export quantized FP16 TensorRT
yolo export model=yolo12n.pt format=engine half

# export quantized INT8 TensorRT
yolo export model=yolo12n.pt format=engine int8

References:

  1. https://docs.nvidia.com/deeplearning/tensorrt/quick-start-guide/index.html
  2. https://docs.ultralytics.com/modes/export/#export-formats
  3. https://github.com/NVIDIA/TensorRT/tree/master/samples/trtexec

Troubleshooting:

  1. Use trtexec --loadEngine=yolo12n.engine to check the engine.
  2. If the exported engine fails to load, see the related issue.

Convert to FP16 with onnxconverter_common

NOTE: Set keep_io_types=True to keep inputs/outputs as FP32; otherwise they are converted to FP16.

import onnx
from onnxconverter_common import float16

# Load original model
model = onnx.load("model.onnx")

model_fp16 = float16.convert_float_to_float16(
    model,
    # keep_io_types=True,
    node_block_list=[]
)

# Save
onnx.save(model_fp16, "model_fp16.onnx")

Export of quantized YOLO-NAS INT8 model

  1. Export quantized ONNX model

from super_gradients.conversion.conversion_enums import ExportQuantizationMode
from super_gradients.conversion import DetectionOutputFormatMode
from super_gradients.common.object_names import Models
from super_gradients.training import models

# From custom model
# model = models.get(Models.YOLO_NAS_S, num_classes=1, checkpoint_path='ckpt_best.pth')
model = models.get(Models.YOLO_NAS_S, pretrained_weights="coco")
export_result = model.export(
    "yolo_nas_s_int8.onnx",
    output_predictions_format=DetectionOutputFormatMode.BATCH_FORMAT,
    quantization_mode=ExportQuantizationMode.INT8 # or ExportQuantizationMode.FP16
)

print(export_result)

  2. Convert to TensorRT with the INT8 builder

trtexec --onnx=yolo_nas_s_int8.onnx --saveEngine=yolo_nas_s_int8.plan --int8

References:

  1. https://github.com/Deci-AI/super-gradients/blob/b5eb12ccd021ca77e947bf2dde7e84a75489e7ed/documentation/source/models_export.md

Start the Triton Inference Server

docker compose up tritonserver

References:

  1. https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/model_repository.html

Sample usage

Check cmd/main.go for more details.

  • For help
go run cmd/main.go --help
  -b    Run benchmark.
  -i string
        Inference Image. (default "images/1.jpg")
  -m string
        Name of model being served (Required) (default "yolonas")
  -n int
        Number of benchmark run. (default 1)
  -o float
        Intersection over Union (IoU) (default 0.7)
  -p float
        Minimum probability (default 0.5)
  -t string
        Type of model. Available options: [yolonas, yolonasint8, yolofp16, yolofp32] (default "yolonas")
  -u string
        Inference Server URL. (default "tritonserver:8001")
  -x string
        Version of model. Default: Latest Version
  • Sample usage with yolonasint8 model
go run cmd/main.go -m yolonasint8 -t yolonasint8 -i images/1.jpg
1. processing time: 123.027909ms
prediction:  0
class:  dog
confidence: 0.96
bboxes: [ 669 130 1061 563 ]
---------------------
prediction:  1
class:  person
confidence: 0.96
bboxes: [ 440 30 760 541 ]
---------------------
prediction:  2
class:  dog
confidence: 0.93
bboxes: [ 168 83 495 592 ]
---------------------
  • Sample usage to get benchmark results
go run cmd/main.go -m yolonasint8 -t yolonasint8 -i images/1.jpg -b -n 10
1. processing time: 64.253978ms
2. processing time: 51.812457ms
3. processing time: 80.037468ms
4. processing time: 96.73738ms
5. processing time: 87.22928ms
6. processing time: 95.28627ms
7. processing time: 61.609115ms
8. processing time: 87.625844ms
9. processing time: 70.356198ms
10. processing time: 74.130759ms
Avg processing time: 76.93539ms
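
The benchmark mode above reports a time per run and an average. A timing loop of that shape could be sketched as follows. This is not the code in cmd/main.go: benchmark and average are hypothetical helpers, and the closure passed in stands in for a real inference call such as (*YoloTriton).Infer.

```go
package main

import (
	"fmt"
	"time"
)

// average returns the mean of ds, or 0 for an empty slice.
func average(ds []time.Duration) time.Duration {
	if len(ds) == 0 {
		return 0
	}
	var total time.Duration
	for _, d := range ds {
		total += d
	}
	return total / time.Duration(len(ds))
}

// benchmark calls fn n times, prints each elapsed time, and returns all of them.
// It stops at the first error and returns the timings collected so far.
func benchmark(n int, fn func() error) ([]time.Duration, error) {
	times := make([]time.Duration, 0, n)
	for i := 0; i < n; i++ {
		start := time.Now()
		if err := fn(); err != nil {
			return times, fmt.Errorf("run %d: %w", i+1, err)
		}
		elapsed := time.Since(start)
		times = append(times, elapsed)
		fmt.Printf("%d. processing time: %v\n", i+1, elapsed)
	}
	return times, nil
}

func main() {
	// The sleep is a placeholder for one inference round trip.
	times, err := benchmark(3, func() error {
		time.Sleep(2 * time.Millisecond)
		return nil
	})
	if err != nil {
		fmt.Println("benchmark failed:", err)
		return
	}
	fmt.Printf("Avg processing time: %v\n", average(times))
}
```

The first run against a freshly started server is often slower than later ones, so averaging over several runs gives a steadier figure than a single measurement.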

Results

[Figure: input and output images]

Documentation

Variables

var YoloClasses = []string{
	"person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat",
	"traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse",
	"sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie",
	"suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove",
	"skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon",
	"bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut",
	"cake", "chair", "couch", "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse",
	"remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink", "refrigerator", "book",
	"clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush",
}
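
Detection models usually report a numeric class index, and a label list such as YoloClasses turns that index into a name. The following is a minimal sketch, not code from the package: classes is a three-entry excerpt of the list above, and className is a hypothetical helper.

```go
package main

import "fmt"

// classes is a short excerpt of YoloClasses, copied here so the sketch runs on its own.
var classes = []string{"person", "bicycle", "car"}

// className is a hypothetical helper: it maps a class index to its label
// and returns "unknown" when the index falls outside the list.
func className(names []string, idx int) string {
	if idx < 0 || idx >= len(names) {
		return "unknown"
	}
	return names[idx]
}

func main() {
	fmt.Println(className(classes, 2))
	fmt.Println(className(classes, 99))
}
```

The bounds check matters when a custom model is served with a class count that differs from the length of the label list.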

Functions

func DrawBoundingBoxes

func DrawBoundingBoxes(img image.Image, boxes []Box, lineWidth int, fontSize float64) (image.Image, error)

func LoadImage

func LoadImage(imagePath string) (image.Image, error)

func ModelInferRequest

func ModelInferRequest(client triton.GRPCInferenceServiceClient, modelInferRequest *triton.ModelInferRequest) (*triton.ModelInferResponse, error)

func ModelMetadataRequest

func ModelMetadataRequest(client triton.GRPCInferenceServiceClient, modelName string, modelVersion string) (*triton.ModelMetadataResponse, error)

func SaveImage

func SaveImage(img image.Image, filename string) error

Types

type Box

type Box struct {
	X1          float64
	Y1          float64
	X2          float64
	Y2          float64
	Probability float64
	Class       string
}

type Model added in v0.2.0

type Model interface {
	GetConfig() YoloTritonConfig
	PreProcess(img image.Image, targetWidth uint, targetHeight uint) (*triton.InferTensorContents, error)
	PostProcess(rawOutputContents [][]byte) ([]Box, error)
}
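
PostProcess receives one byte slice per output tensor. The sketch below decodes such a slice under the assumption that the tensor holds FP32 values in little-endian order. The package's own decoding is not shown on this page, and FP16 or INT8 outputs would need different handling. decodeFloat32 is a hypothetical helper.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// decodeFloat32 is a hypothetical helper that interprets raw as a packed
// sequence of little-endian IEEE 754 float32 values.
func decodeFloat32(raw []byte) ([]float32, error) {
	if len(raw)%4 != 0 {
		return nil, errors.New("raw length is not a multiple of 4")
	}
	out := make([]float32, len(raw)/4)
	for i := range out {
		bits := binary.LittleEndian.Uint32(raw[i*4:])
		out[i] = math.Float32frombits(bits)
	}
	return out, nil
}

func main() {
	// Build a two-value buffer to stand in for one entry of rawOutputContents.
	raw := make([]byte, 8)
	binary.LittleEndian.PutUint32(raw[0:], math.Float32bits(0.5))
	binary.LittleEndian.PutUint32(raw[4:], math.Float32bits(0.96))

	vals, err := decodeFloat32(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(vals)
}
```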

func NewYolo added in v0.5.0

func NewYolo(cfg YoloTritonConfig, io16 bool) Model

func NewYoloNAS added in v0.2.0

func NewYoloNAS(cfg YoloTritonConfig) Model

func NewYoloNASInt8 added in v0.3.0

func NewYoloNASInt8(cfg YoloTritonConfig) Model

type ModelType added in v0.2.0

type ModelType string
const (
	ModelTypeYoloFP16    ModelType = "yolofp16"
	ModelTypeYoloFP32    ModelType = "yolofp32"
	ModelTypeYoloNAS     ModelType = "yolonas"
	ModelTypeYoloNASInt8 ModelType = "yolonasint8"
)
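
Because ModelType is a plain string type, any string converts to it without complaint, so a value taken from the -t flag is worth validating before use. A minimal sketch under that assumption follows. The type and constants are re-declared locally so the example runs on its own, and parseModelType is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// ModelType and its constants mirror the declarations documented above.
type ModelType string

const (
	ModelTypeYoloFP16    ModelType = "yolofp16"
	ModelTypeYoloFP32    ModelType = "yolofp32"
	ModelTypeYoloNAS     ModelType = "yolonas"
	ModelTypeYoloNASInt8 ModelType = "yolonasint8"
)

// parseModelType is a hypothetical helper that accepts only the four known values.
func parseModelType(s string) (ModelType, error) {
	switch mt := ModelType(s); mt {
	case ModelTypeYoloFP16, ModelTypeYoloFP32, ModelTypeYoloNAS, ModelTypeYoloNASInt8:
		return mt, nil
	default:
		return "", fmt.Errorf("unknown model type %q", s)
	}
}

func main() {
	for _, s := range []string{"yolonasint8", "yolov3"} {
		mt, err := parseModelType(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("ok:", mt)
	}
}
```

A caller would then pick one of the documented constructors (NewYolo, NewYoloNAS, or NewYoloNASInt8) based on the validated value.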

type Yolo added in v0.5.0

type Yolo struct {
	YoloTritonConfig
	// contains filtered or unexported fields
}

func (*Yolo) GetConfig added in v0.5.0

func (y *Yolo) GetConfig() YoloTritonConfig

func (*Yolo) PostProcess added in v0.5.0

func (y *Yolo) PostProcess(rawOutputContents [][]byte) ([]Box, error)

func (*Yolo) PreProcess added in v0.5.0

func (y *Yolo) PreProcess(img image.Image, targetWidth uint, targetHeight uint) (*triton.InferTensorContents, error)
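
Preprocessing for YOLO-style models commonly converts pixels into a planar float32 tensor. The sketch below assumes RGB channel order, channel-first (CHW) layout, and scaling to [0, 1]. These are assumptions, not a description of what PreProcess actually does, and resizing to targetWidth by targetHeight is left out. toCHW and demoImage are hypothetical helpers.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// toCHW is a hypothetical helper: it flattens img into planar
// (channel, height, width) float32 data with each channel scaled to [0, 1].
func toCHW(img image.Image) []float32 {
	b := img.Bounds()
	w, h := b.Dx(), b.Dy()
	out := make([]float32, 3*w*h)
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			// RGBA returns 16-bit channel values in the range [0, 65535].
			r, g, bl, _ := img.At(b.Min.X+x, b.Min.Y+y).RGBA()
			i := y*w + x
			out[i] = float32(r) / 65535
			out[w*h+i] = float32(g) / 65535
			out[2*w*h+i] = float32(bl) / 65535
		}
	}
	return out
}

// demoImage builds a 2x1 image: one opaque red pixel, then one opaque blue pixel.
func demoImage() image.Image {
	img := image.NewRGBA(image.Rect(0, 0, 2, 1))
	img.Set(0, 0, color.RGBA{R: 255, A: 255})
	img.Set(1, 0, color.RGBA{B: 255, A: 255})
	return img
}

func main() {
	// The slice holds the red plane, then the green plane, then the blue plane.
	fmt.Println(toCHW(demoImage()))
}
```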

type YoloNAS added in v0.2.0

type YoloNAS struct {
	YoloTritonConfig
	// contains filtered or unexported fields
}

func (*YoloNAS) GetConfig added in v0.2.0

func (y *YoloNAS) GetConfig() YoloTritonConfig

func (*YoloNAS) PostProcess added in v0.2.0

func (y *YoloNAS) PostProcess(rawOutputContents [][]byte) ([]Box, error)

func (*YoloNAS) PreProcess added in v0.2.0

func (y *YoloNAS) PreProcess(img image.Image, targetWidth uint, targetHeight uint) (*triton.InferTensorContents, error)

type YoloNASInt8 added in v0.3.0

type YoloNASInt8 struct {
	YoloTritonConfig
	// contains filtered or unexported fields
}

func (*YoloNASInt8) GetConfig added in v0.3.0

func (y *YoloNASInt8) GetConfig() YoloTritonConfig

func (*YoloNASInt8) PostProcess added in v0.3.0

func (y *YoloNASInt8) PostProcess(rawOutputContents [][]byte) ([]Box, error)

func (*YoloNASInt8) PreProcess added in v0.3.0

func (y *YoloNASInt8) PreProcess(img image.Image, targetWidth uint, targetHeight uint) (*triton.InferTensorContents, error)

type YoloTriton

type YoloTriton struct {
	// contains filtered or unexported fields
}

func New

func New(url string, model Model) (*YoloTriton, error)

func (*YoloTriton) Close

func (y *YoloTriton) Close() error

func (*YoloTriton) Infer

func (y *YoloTriton) Infer(img image.Image) ([]Box, error)

type YoloTritonConfig

type YoloTritonConfig struct {
	NumClasses     int
	NumObjects     int
	ModelName      string
	ModelVersion   string
	MinProbability float32
	MaxIOU         float64
	Classes        []string
}
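
MinProbability and MaxIOU appear to correspond to the -p and -o flags shown earlier. Thresholds like these are typically applied as a confidence filter followed by non-maximum suppression (NMS). The sketch below is a generic, class-agnostic greedy NMS, not the package's implementation. Box mirrors the documented struct, and iou, area, and nms are hypothetical helpers that take both thresholds as float64 for simplicity.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Box mirrors the fields of the documented Box struct.
type Box struct {
	X1, Y1, X2, Y2 float64
	Probability    float64
	Class          string
}

// area returns the box area, treating inverted boxes as empty.
func area(b Box) float64 {
	return math.Max(0, b.X2-b.X1) * math.Max(0, b.Y2-b.Y1)
}

// iou returns the intersection-over-union of a and b in [0, 1].
func iou(a, b Box) float64 {
	iw := math.Max(0, math.Min(a.X2, b.X2)-math.Max(a.X1, b.X1))
	ih := math.Max(0, math.Min(a.Y2, b.Y2)-math.Max(a.Y1, b.Y1))
	inter := iw * ih
	union := area(a) + area(b) - inter
	if union <= 0 {
		return 0
	}
	return inter / union
}

// nms drops boxes below minProb, then greedily keeps the most confident boxes,
// discarding any box whose IoU with an already kept box exceeds maxIOU.
func nms(boxes []Box, minProb, maxIOU float64) []Box {
	var cand []Box
	for _, b := range boxes {
		if b.Probability >= minProb {
			cand = append(cand, b)
		}
	}
	sort.SliceStable(cand, func(i, j int) bool {
		return cand[i].Probability > cand[j].Probability
	})
	var kept []Box
	for _, c := range cand {
		keep := true
		for _, k := range kept {
			if iou(c, k) > maxIOU {
				keep = false
				break
			}
		}
		if keep {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	boxes := []Box{
		{X1: 0, Y1: 0, X2: 100, Y2: 100, Probability: 0.9, Class: "dog"},
		{X1: 5, Y1: 5, X2: 105, Y2: 105, Probability: 0.8, Class: "dog"},
		{X1: 200, Y1: 200, X2: 300, Y2: 300, Probability: 0.6, Class: "person"},
		{X1: 400, Y1: 400, X2: 500, Y2: 500, Probability: 0.3, Class: "cat"},
	}
	for _, b := range nms(boxes, 0.5, 0.7) {
		fmt.Printf("%s %.2f\n", b.Class, b.Probability)
	}
}
```

With the thresholds used here, the second dog box overlaps the first with an IoU of about 0.82 and is suppressed, while the cat box falls below the minimum probability, so only the first dog box and the person box remain.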
