This section covers methods for recognizing, analyzing, and transforming images, speech, and other data patterns. Choose a subsection for a more precise classification.


144 publications

When Search Becomes Memory: Turning Robot Design Trials into Transferable Skills
SAM3-Assisted Training of Lightweight YOLO Models for Precision Pig Farming
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation
F-RNG: Feed-Forward Relightable Neural Gaussians
LLaVA-OneVision-2: Towards Next-Generation Perceptual Intelligence
Towards 3D heart mesh generation using contactless radar imaging and physics-informed neural network
MAGIC: Multimodal Alignment & Grounding-aware Instruction Coreset for Vision-Language Models
Astronomical Image Data Reduction for Moving Object Detection
RiGS: Rigid-aware 4D Gaussian Splatting from a Single Monocular Video
CRONOS: Benchmarking Counterfactual Physical Consistency in Video Models
Revitalizing Dense Material Segmentation: Stabilized Vision Transformers and the Generalization Paradox
Evaluating the Effect of Compression on Video Temporal Consistency Using Objective Quality Metrics
Using a Digital Twin for Fringe Projection Profilometry Optimisation
Mixtac: A Novel Bio-Inspired Hybrid Tactile Sensor with Synergistic Event-Frame Perception
Dynamic MRI Reconstruction Via Dual Deep Priors and Low-Rank Plus Sparse Modeling
Spatio-Temporal Similarity Volume Aggregation for Open-Vocabulary Action Recognition
General Hazard Detection
Efficient Learned Image Compression without Entropy Coding
Enhancing Blood Cells Classification using Hybrid Quantum Neural Networks
GFSR: Geometric Fidelity and Spatial Refinement for Reliable Lane Detection
CHASD: Language Increment-Calibrated Contrastive Decoding against Hallucination in LVLMs
SCOPE: Simulating Cross-game Operations in Playable Environments for FPS World Models
Decoupling Spatio-Temporal Adapter for Fine-Grained Badminton Action Localization
IPG-Net: Image Pyramid Guidance Network for Small Object Detection
Sketchable Histograms of Oriented Gradients for Object Detection
Rethinking image formats for computer vision: JPEG sRGB, linear, and log RGB; object detection and shadow removal
Robustness of breast lesion segmentation under MRI undersampling improves with k-space-aware deep learning
3D LULC classification using multispectral LiDAR and deep learning: current and prospective schemes
4D-GSW: Kinematic-Aware Spatio-Temporal Consistent Watermarking for 4D Gaussian Splatting
Identifying visual attributes for object recognition from text and taxonomy
MotiMotion: Motion-Controlled Video Generation with Visual Reasoning
Cambrian-P: Pose-Grounded Video Understanding
Which Way Did It Move? Diagnosing and Overcoming Directional Motion Blindness in Video-LLMs
Computer Vision in Agriculture: Object Detection, Recognition, and Image Segmentation Techniques and Advanced Image Analysis
RDDM: A Residual-Driven Drifting Model for High-Fidelity Low-Dose CT Denoising
EchoSR: Efficient Context Harnessing for Lightweight Image Super-Resolution
LUMEN: Low-light Unified Multi-stage Enhancement Network using depth-guided flash, clustering, and attention-based Transformers
See Silhouettes in Motion with Neuromorphic Vision
SdcNet for object recognition
DIPA: Distilled Preconditioned Algorithms for Solving Imaging Inverse Problems
Learning Normalized Energy Models for Linear Inverse Problems
Dynamic resolution switching for live streaming
Text-RSIR: A Text-Guided Framework for Efficient Remote Sensing Image Transmission and Reconstruction
Disentangling Sampling from Training Budget in Class-Imbalanced CT Body Composition Segmentation
NeuroQA: A Large-Scale Image-Grounded Benchmark for 3D Brain MRI Understanding
PEMark: Watermarking API Responses Based on Proxy Gateways and Position Encoding
Entropy-Guided Self-Supervised Learning for Medical Image Classification
Time-varying rPPG signal separation via block-sparse signal model
SegCompass: Exploring Interpretable Alignment with Sparse Autoencoders for Enhanced Reasoning Segmentation
Computer Vision Based Object Detection and Recognition System for Image Searching
Part-based deformable object detection with a single sketch
Towards Few-Annotation Learning in Computer Vision: Application to Image Classification and Object Detection tasks
AtomicMotion: Learning Human Motion From Different Human Parts
The Double Dilemma in Multi-Task Radiology Report Generation: A Gradient Dynamics Analysis and Solution
From Baseline to Follow-Up: Counterfactual Spine DXA Image Synthesis in UK Biobank Using a Causal Hierarchical Variational Autoencoder
What Does the Caption Really Say? Counterfactual Phrase Intervention for Compositional Data Selection in Vision-Language Pretraining
Matching with Deliberation: Test-Time Evolutionary Hierarchical Multi-Agents for Zero-Shot Compositional Image Retrieval
Supervised Classification Heads as Semantic Prototypes: Unlocking Vision-Language Alignment via Weight Recycling
Training-Free Fine-Grained Semantic Segmentations in Low Data Regimes: A FungiTastic Baseline
LACO: Adaptive Latent Communication for Collaborative Driving
Learning object motion patterns for anomaly detection and improved object detection
RGB-D Salient Object Detection: A Review
Efficient Object Detection and Segmentation for Fine-Grained Recognition
Reducing Object Hallucination in LVLMs via Emphasizing Image-negative Tokens
Automatic Discovery of Disease Subgroups by Contrasting with Healthy Controls
Deformba: Vision State Space Model with Adaptive State Fusion
Hyper-V2X: Hypernetworks for Estimating Epistemic and Aleatoric Uncertainty in Cooperative Bird's-Eye-View Semantic Segmentation
Object detection based on spatiotemporal background models
Fast features for time constrained object detection
Spectral gradients for color-based object recognition and indexing
Diffusion Graph Posterior Sampling for Nonlinear Inverse Problems with Application to Electrical Impedance Tomography
Set Shaping Theory as a Complementary Payload-Shaping Layer for Steganography
FGSVQA: Frequency-Guided Short-form Video Quality Assessment
Probability-Conserving Flow Guidance
Fast moving-object detection in H.264/AVC compressed domain for video surveillance
A framework for abandoned object detection from video surveillance
Adaptive object detection and recognition based on a feedback strategy
Physics-informed simulation framework for realistic sonar image generation and statistical validation
Physics-in-the-Loop: A Hybrid Agentic Architecture for Validated CAD Engineering Design
Efficient Long-Context Modeling in Diffusion Language Models via Block Approximate Sparse Attention
Tango3D: Towards Alignment for Global and Local 2D-3D Correspondence
CADENet: Condition-Adaptive Asynchronous Dual-Stream Enhancement Network for Adverse Weather Perception in Autonomous Driving
When Preference Labels Fall Short: Aligning Diffusion Models from Real Data
FineBench: Benchmarking and Enhancing Vision-Language Models for Fine-grained Human Activity Understanding
A Framework for Evaluating Zero-Shot Image Generation in Concept-based Explainability
An object detection and recognition system for weld bead extraction from digital radiographs
Aurora: Unified Video Editing with a Tool-Using Agent
WavFlow: Audio Generation in Waveform Space
Can These Views Be One Scene? Evaluating Multiview 3D Consistency when 3D Foundation Models Hallucinate
Special issue on 3D representation for object and scene recognition
Finite asymmetric generalized Gaussian mixture models learning for infrared object detection
Computer vision object detection and image recognition algorithm optimization based on self-supervised learning
CATA: Continual Machine Unlearning via Conflict-Averse Task Arithmetic
ManiSoft: Towards Vision-Language Manipulation for Soft Continuum Robotics
CrossView Suite: Harnessing Cross-view Spatial Intelligence of MLLMs with Dataset, Model and Benchmark
SPIKE: An Adaptive Dual Controller Framework for Cost-Efficient Long-Horizon Game Agents
Object Recognition
Recent Progress on Object Classification and Detection
Object detection with vector quantized binary features
Automatic Representation and Classifier Optimization for Image-based Object Recognition
AWADA: Foreground-focused adversarial learning for cross-domain object detection
The MixCount Dataset: Bridging the Data Gap for Open-Vocabulary Object Counting
SENSE: Satellite-based ENergy Synthesis for Sustainable Environment
DanceHMR: Hand-Aware Whole-Body Human Mesh Recovery from Monocular Videos
TaskGround: Structured Executable Task Inference for Full-Scene Household Reasoning
Spatial Competition for Low-Complexity Learned Image Compression
Learning to Optimize Radiotherapy Plans via Fluence Maps Diffusion Model Generation and LSTM-based Optimization
An Underwater Dehazing Network with Implicit Transmission Estimation
Keyed Nonlinear Transform: Lightweight Privacy-Enhancing Feature Sharing for Medical Image Analysis
Application of Computer Vision Algorithms in Image Recognition and Object Detection
Boosting masked dominant orientation templates for efficient object detection
Sparse Autoencoders enable Robust and Interpretable Fine-tuning of CLIP models
WorldVLN: Autoregressive World Action Model for Aerial Vision-Language Navigation
Deterministic Event-Graph Substrates as World Models for Counterfactual Reasoning
Flash-GRPO: Efficient Alignment for Video Diffusion via One-Step Policy Optimization
Evaluating a color-based active basis model for object recognition
Learning-Based Night-Vision Image Recognition and Object Detection
Research on electrical equipment fault detection by combining object detection and image segmentation algorithms
Introduction to the CVIU special issue on “Parts and Attributes: Mid-level representation for object recognition, scene classification and object detection”
Object recognition with uncertain geometry and uncertain part detection
Detection and matching of object using proposed signature
AI in Computer Vision: Image Processing, Object Detection, and Recognition Techniques
Object recognition using discriminative parts
Visual Object Recognition with Image Retrieval
Evaluation of Anatomical Shape Priors in Deep Learning-Based Cardiac Multi-Compartment Segmentation
3D Segmentation Using Viewpoint-Dependent Spatial Relationships
EntropyScan: Towards Model-level Backdoor Detection in LVLMs via Visual Attention Entropy
Semi-MedRef: Semi-Supervised Medical Referring Image Segmentation with Cross-Modal Alignment
Multiview feature distributions for object detection and continuous pose estimation
Discriminative Training for Object Recognition Using Image Patches
The Velocity Deficit: Initial Energy Injection for Flow Matching
HDRFace: Rethinking Face Restoration with High-Dimensional Representation
Learning Direct Control Policies with Flow Matching for Autonomous Driving
Multi-proposal Collaboration and Multi-task Training for Weakly-supervised Video Moment Retrieval
50 Years of object recognition: Directions forward
The 3dSOBS+ algorithm for moving object detection
Holistic object detection and image understanding
Pick-Object-Attack: Type-specific adversarial attack for object detection
Fast quality-guided phase unwrapping algorithm for 3D profilometry based on object image edge detection
Histogram of Radon Projections: A new descriptor for object detection
Automated Image Preparation for Real-Time Recognition by Robotic Systems
Splitting the Contour of a Graphic Object's Image into Fragments in Classification Problems
A Study of the Sensitivity of Feature Vectors Formed from Multiscale Transforms of Processed Images
Algorithms for Video Image Analysis in Diagnosing Disease Types Based on Wavelet Wave Functions

4 more articles in subsections