E-book, English, Volume 15037, 587 pages
Lin / Cheng / He Pattern Recognition and Computer Vision
Publication year: 2024
ISBN: 978-981-97-8511-7
Publisher: Springer Singapore
Format: PDF
Copy protection: 1 - PDF Watermark
7th Chinese Conference, PRCV 2024, Urumqi, China, October 18–20, 2024, Proceedings, Part VII
Series: Lecture Notes in Computer Science
Target audience
Research
Authors/Editors
Further information and material
Scene Text Recognition via k-NN Attention-based Decoder and Margin-based Softmax Loss
Real-Time Text Detection with Multi-Level Feature Fusion and Pixel Clustering
REFINED AND LOCALITY-ENHANCED FEATURE FOR HANDWRITTEN MATHEMATICAL EXPRESSION RECOGNITION
Learning Fine-grained and Semantically Aware Mamba Representations for Tampered Text Detection in Images
Dual Feature Enhanced Scene Text Recognition Method for Low-Resource Uyghur
Segmentation-free Todo Mongolian OCR and Its Public Dataset
Hybrid Encoding Method for Scene Text Recognition in Low-Resource Uyghur
ROBC: a Radical-Level Oracle Bone Character Dataset
Integrated Recognition of Arbitrary-Oriented Multi-Line Billet Number
Improving Scene Text Recognition with Counting Aware Contrastive Learning and Attention Alignment
GridMask: An Efficient Scheme for Real Time Curved Scene Text Detection
Tibetan Handwriting Recognition Method based on Structural Re-parameterization ViT and Vertical Attention
MFH: Marrying Frequency Domain with Handwritten Mathematical Expression Recognition
Leveraging Structure Knowledge and Deep Models for the Detection of Abnormal Handwritten Text
OCR-aware Scene Graph Generation via Multi-modal Object Representation Enhancement and Logical Bias Learning
Enhancing Transformer-based Table Structure Recognition for Long Tables
Show Exemplars and Tell Me What You See: In-context Learning with Frozen Large Language Models for TextVQA
MLR-NET: an arbitrary skew angle detection algorithm for complex layout document images
TextViTCNN: Enhancing Natural Scene Text Recognition with Hybrid Transformer and Convolutional Networks
Enhancing Visual Information Extraction with Large Language Models through Layout-aware Instruction Tuning
SFENet: Arbitrary Shapes Scene Text Detection with Semantic Feature Extractor
Improving Zero-Shot Image Captioning Efficiency with Metropolis-Hastings Sampling
Improving Text Classification Performance through Multimodal Representation
A Multi-feature Fusion Approach for Words Recognition of Ancient Mongolian Documents
TableRocket: An Efficient and Effective Framework for Table Reconstruction
Not All Texts Are the Same: Dynamically Querying Texts for Scene Text Detection
Multi-Modal Attention based on 2D Structured Sequence for Table Recognition
A Two-stream Hybrid CNN-Transformer Network for Skeleton-based Human Interaction Recognition
Skeleton-Language Pre-training to Collaborate with Self-Supervised Human Action Recognition
Spatio-Temporal Contrastive Learning for Compositional Action Recognition
Path-Guided Motion Prediction with Multi-View Scene Perception
Privacy-preserving Action Recognition: A Survey
Attention-based Spatio-temporal modeling with 3D Convolutional Neural Networks for Dynamic Gesture Recognition
MIT: Multi-cue Injected Transformer for Two-stage HOI Detection
DIDA: Dynamic Individual-to-integrated Augmentation for Self-Supervised Skeleton-Based Action Recognition
Multi-scale Spatial and Temporal Feature Aggregation Graph Convolutional Network for Skeleton-Based Action Recognition
Improving Video Representation of Vision-Language Model with Decoupled Explicit Temporal Modeling
KS-FuseNet: An efficient action recognition method based on keyframe selection and feature fusion
Dynamic Skeleton Association Transformer for dyadic Interaction Action Recognition
Species-Aware Guidance for Animal Action Recognition with Vision-Language Knowledge