
A DeepSeek Moment in Computer Vision

Posted by Zhora Gevorgyan


Lead Computer Vision Engineer & Co-Founder, Scylla Technologies Inc.

Scylla Technologies Inc. is proud to announce a major milestone: our flagship AI model, ScyllaNet, has secured 2nd place on the globally respected COCO (Common Objects in Context) test-dev leaderboard with its submission on September 24, 2025. Achieving a mean Average Precision (mAP) of 0.66, ScyllaNet now stands shoulder-to-shoulder with the world’s top contenders.

This accomplishment marks a defining moment for lightweight AI in security—echoing the disruptive impact of DeepSeek—by proving that compact, efficient models can match or exceed heavyweight systems. It reinforces Scylla’s leadership in real-time object detection and AI-powered security, paving the way for next-generation computer vision deployments across critical infrastructure worldwide.

Precision and Efficiency Redefined

Benchmarked on COCO—the gold standard for object detection—ScyllaNet delivers strong performance across all object scales:

● Large objects: 0.79 mAP, 0.92 AR
● Medium objects: 0.69 mAP
● Small objects: 0.50 mAP, 0.81 AR (max=100)

These results translate directly to real-world impact: ScyllaNet accurately detects intruders, weapons, and abnormal behaviors even in crowded and complex scenes.

Despite its power, ScyllaNet contains just ~40 million parameters, making it 90–200× faster than models like Co-DETR ViT-L (300M–3B parameters). This leap in efficiency shows how lightweight architectures can deliver both speed and accuracy, unlocking broader deployment for security operations at scale.
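For readers who want to sanity-check a parameter count like this on their own models, the snippet below shows the standard way to do it in PyTorch. ScyllaNet itself is proprietary and not shown here, so a stock torchvision ResNet-50 stands in purely to illustrate the counting method.

```python
import torch.nn as nn
from torchvision.models import resnet50

def count_parameters(model: nn.Module) -> float:
    """Return the number of trainable parameters, in millions."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

# ResNet-50 is used only as a stand-in; it reports roughly 25.6M parameters.
print(f"{count_parameters(resnet50()):.1f}M trainable parameters")
```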

At the core of this advancement is Scylla’s proprietary SIoU (Scylla-IoU) loss function, which improves bounding box regression by factoring in distance, angle, and shape. This innovation accelerates convergence and boosts accuracy, outperforming models like YOLOv8 in both speed and precision.
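The production loss inside ScyllaNet is proprietary, but the publicly described SIoU formulation combines a plain IoU term with angle, distance, and shape penalties. The sketch below follows that public formulation in PyTorch; treat it as an illustration rather than the shipped implementation, and note that details such as the shape exponent theta=4 are assumptions taken from the public description.

```python
import math
import torch

def siou_loss(pred, target, eps=1e-7, theta=4.0):
    """Sketch of an SIoU-style loss for boxes given as (x1, y1, x2, y2)."""
    # Standard IoU term
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
    w_p, h_p = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
    w_t, h_t = target[:, 2] - target[:, 0], target[:, 3] - target[:, 1]
    union = w_p * h_p + w_t * h_t - inter + eps
    iou = inter / union

    # Box centers and enclosing-box dimensions
    cx_p, cy_p = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    cx_t, cy_t = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0]) + eps
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1]) + eps

    # Angle cost: largest when the center offset sits on the 45-degree diagonal
    sigma = torch.sqrt((cx_t - cx_p) ** 2 + (cy_t - cy_p) ** 2) + eps
    sin_alpha = (torch.abs(cy_t - cy_p) / sigma).clamp(0, 1)
    angle = torch.cos(2 * (torch.arcsin(sin_alpha) - math.pi / 4))

    # Distance cost, modulated by the angle term
    gamma = 2 - angle
    rho_x = ((cx_t - cx_p) / cw) ** 2
    rho_y = ((cy_t - cy_p) / ch) ** 2
    dist = (1 - torch.exp(-gamma * rho_x)) + (1 - torch.exp(-gamma * rho_y))

    # Shape cost: relative width/height mismatch against the larger box
    omega_w = torch.abs(w_p - w_t) / (torch.max(w_p, w_t) + eps)
    omega_h = torch.abs(h_p - h_t) / (torch.max(h_p, h_t) + eps)
    shape = (1 - torch.exp(-omega_w)) ** theta + (1 - torch.exp(-omega_h)) ** theta

    return 1 - iou + (dist + shape) / 2
```

Intuitively, the angle term first steers the predicted box toward the nearest axis of the target center before the distance term takes over, which is the mechanism credited with faster convergence than plain IoU-based losses.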

A Turning Point for AI in Security

Earning 2nd place on the COCO leaderboard is more than a technical success—it’s proof that lightweight, efficient AI can lead in a market expected to hit $28.8 billion in AI video surveillance by 2030. ScyllaNet challenges traditional thinking around edge AI, demonstrating that real-time monitoring at scale can be achieved for military bases, smart cities, and critical infrastructure.

This milestone positions Scylla Technologies at the forefront of global security innovation.

Benchmark and Evaluation Protocol

We evaluated ScyllaNet on the COCO test-dev 2025 benchmark using the official CodaLab evaluation server. All results are for bounding-box detection and follow the COCO evaluation protocol, computing mAP across IoU thresholds from 0.50 to 0.95 and Average Recall (AR) across multiple scales and detection limits. The model was trained without external data and evaluated with standard test-time augmentation (TTA).
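The same protocol can be reproduced locally with pycocotools. The file names below are placeholders, and test-dev ground truth is not public, so a local run would typically target val2017 instead of the held-out test set.

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Placeholder file names for illustration only.
coco_gt = COCO("instances_val2017.json")          # ground-truth annotations
coco_dt = coco_gt.loadRes("scyllanet_dets.json")  # detections in COCO results format

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP@[.50:.95], AP50, AP75, AP_s/m/l, AR@1/10/100, AR_s/m/l
```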

Overall Leaderboard Standing


Rank | Team / Model | AP@[.50:.95] | Parameters / Notes
1 | CW_Detection | 0.66 | Top-ranked overall AP; fewer metric leads than ScyllaNet
2 | Scylla Technologies Inc. (ScyllaNet) | 0.6612 | Leads 6 of 12 metrics; highest average rank across all metrics (~40M params)
3 | zongzhuofan (SenseTime Basemodel) | 0.66 | ~300M params; Co-DETR-style transformer
4 | UBTECH Thinker | 0.66 | Unknown architecture
5 | InternImage-DCNv3 (PJLab & Tsinghua) | 0.655 | InternImage-H variant; best AP_small & AR_large (~1B params)

Detailed Metric Comparison

ScyllaNet ranks first in AP_medium, AR@10, AR@100, AR_medium, AR_small, and AR@1 (tied), and second in AR_large and AP_large.


Comparison of COCO test-dev performance: the first row shows ScyllaNet, the second row shows InternImage-H, and the fourth row shows Co-DETR.

Comparative Insights

Compared to CW_Detection:

● Nearly identical overall AP@[.50:.95] (difference < 0.01)
● Slightly lower AP@.50 (–0.01), reflecting a stricter precision regime
● Higher recall across most metrics (+0.01–0.02 on average)

Compared to InternImage and Co-DETR, ScyllaNet achieves similar or better accuracy with 6–8× higher parameter efficiency, thanks to adaptive receptive fields and attention-based feature aggregation—not brute-force scaling.


Conclusion

ScyllaNet’s submission ranks 2nd overall with mAP@[.50:.95] = 66.12%, achieving the highest mean rank across all 12 COCO metrics while using only ~40M parameters. These results underscore ScyllaNet’s architectural efficiency and balanced precision-recall trade-off, establishing it as a new benchmark for lightweight, high-performance object detection.

