Discussion: Model Inference Optimization Techniques for Real-Time Streaming Pipeline #68
Comments
Hardware acceleration sounds like a great solution! I'll check out the suggestions and code you provided as soon as possible. Thank you so much!
Timing statistics: the most time-consuming phases are inference and annotation, at roughly 1.07 s and 430 ms respectively.
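Per-stage timings like these can be collected with simple wall-clock instrumentation around each phase. A minimal sketch in Rust; the stage bodies below are placeholder sleeps, not the actual inference/annotate code:

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Measure the wall-clock time of one pipeline stage.
fn time_stage<F: FnOnce()>(stage: F) -> Duration {
    let start = Instant::now();
    stage();
    start.elapsed()
}

fn main() {
    // Hypothetical stand-ins for the real inference and annotate stages.
    let infer = time_stage(|| sleep(Duration::from_millis(10)));
    let annotate = time_stage(|| sleep(Duration::from_millis(5)));
    println!("inference: {:?}, annotate: {:?}", infer, annotate);
}
```

Timing each stage separately, rather than the whole loop, is what makes it possible to say "inference dominates" instead of just "the pipeline is slow".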
From your timing results, model inference indeed occupies a significant share of the runtime, and the annotate step is also costly. Here are several aspects to analyze:
I have been traveling on business recently and do not have access to a computer to test the whole pipeline and …
I tried compiling `rsmpeg v0.15.1+ffmpeg.7.0` (https://github.com/phial3/rsmpeg?branch=light#13f8c554) and got:

```
error[E0605]: non-primitive cast: `unsafe extern "C" fn(*mut c_void, *const u8, i32) -> i32 {write_c}` as `unsafe extern "C" fn(*mut c_void, *mut u8, i32) -> i32`
   --> /home/qweasd/.cargo/git/checkouts/rsmpeg-6e0a08a626b70a61/13f8c55/src/avformat/avio.rs:148:50
    |
148 |         write_packet.is_some().then_some(write_c as _),
    |                                          ^^^^^^^^^^^^ invalid cast

For more information about this error, try `rustc --explain E0605`.
error: could not compile `rsmpeg` (lib) due to 1 previous error
```

I see that this project is under rapid development. I will keep following it and wait for further testing. @phial3
The default feature set is:

```toml
rsmedia = { git = "https://github.com/phial3/rsmedia", branch = "rsmpeg" }
```

If you use ffmpeg version 6.x, the default features need to be disabled:

```toml
rsmedia = { git = "https://github.com/phial3/rsmedia", branch = "rsmpeg", default-features = false, features = ["ffmpeg6", "ndarray"] }
```
When the ffmpeg features are enabled with hardware acceleration support, the DataLoader's decoder should prioritize hardware-accelerated backends (e.g. `nvdec`/`cuvid` for NVIDIA GPUs, `qsv` for Intel GPUs). As an example, consider using rsmedia, which provides hardware-accelerated decoding and encoding.
This is a modification I made to the DataLoader to support hardware acceleration; only `cuda` is supported for now. This is the commit: usls DataLoader support Decoder Hardware acceleration commit
`model.forward` speed with underutilized GPU
I've noticed that GPU resources are significantly underutilized, yet inference remains very slow.
What optimization strategies can I apply?
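A common cause of "GPU idle but pipeline slow" is that decode, inference, and annotation run strictly serially, so the GPU waits on the CPU between frames. Overlapping the stages with a bounded channel keeps the GPU fed while applying backpressure. A minimal sketch using only std threads; the stage bodies are placeholders for the real decode and `model.forward` calls:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Overlap a "decode" producer with an "inference" consumer via a bounded
/// channel; returns the number of frames processed. Stage bodies are
/// placeholders for the real decode/inference code.
fn run_pipeline(frames: u32) -> u32 {
    // Capacity 4: decode may run at most 4 frames ahead of inference,
    // which applies backpressure instead of buffering without bound.
    let (tx, rx) = sync_channel::<u32>(4);

    let decoder = thread::spawn(move || {
        for frame_id in 0..frames {
            tx.send(frame_id).expect("inference stage hung up");
        }
        // Dropping `tx` closes the channel and ends the consumer loop.
    });

    let mut processed = 0;
    for _frame in rx {
        // Run model.forward + annotation here in the real pipeline.
        processed += 1;
    }
    decoder.join().unwrap();
    processed
}

fn main() {
    println!("processed {} frames", run_pipeline(8)); // prints "processed 8 frames"
}
```

Other strategies worth profiling alongside this: batching several frames per forward pass, and checking whether pre/post-processing (resize, NMS, annotation) is pinning a single CPU core while the GPU waits.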
This is my code example:
Workflow: YOLO model detection → bounding box rendering → real-time streaming (e.g. via NVIDIA `nvenc`).
Consideration should be given to resource efficiency and to real-time streaming, so that the output stays smooth, stable, and clear.
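For the real-time streaming requirement, one standard tactic when the encoder momentarily falls behind the detector is to drop stale frames rather than queue them, which bounds end-to-end latency at the cost of skipped frames. A hedged sketch with a hypothetical `latest_frame` helper (not part of usls or rsmedia):

```rust
use std::sync::mpsc::{channel, Receiver, TryRecvError};

/// Drain a channel and keep only the newest frame, dropping stale ones.
/// This bounds latency when the encoder can't keep up with the detector.
fn latest_frame(rx: &Receiver<u32>) -> Option<u32> {
    let mut newest = None;
    loop {
        match rx.try_recv() {
            Ok(frame) => newest = Some(frame), // overwrite: older frames are dropped
            Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
        }
    }
    newest
}

fn main() {
    let (tx, rx) = channel();
    for id in 0..5 {
        tx.send(id).unwrap(); // detector has outpaced the encoder
    }
    // The encoder picks up only frame 4; frames 0..=3 are discarded as stale.
    println!("encoding frame {:?}", latest_frame(&rx)); // prints "encoding frame Some(4)"
}
```

Whether dropping is acceptable depends on the use case: for live monitoring, a fresh picture usually beats a complete but delayed one; for archival, keep every frame and accept buffering instead.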