Explore

These are public items saved by our community.

just-every/code: Fast, effective, mind-blowing, coding CLI. Browser integration, multi-agents, theming, and reasoning control. Orchestrate agents from OpenAI, Claude, Gemini or any provider.

9001/copyparty: Portable file server with accelerated resumable uploads, dedup, WebDAV, FTP, TFTP, zeroconf, media indexer, thumbnails++ all in one file, no deps

awesome-selfhosted/awesome-selfhosted: A list of Free Software network services and web applications which can be hosted on your own servers

Web Application Firewall | SafePoint

bunkerity/bunkerweb: 🛡️ Open-source and next-generation Web Application Firewall (WAF)

deepbeepmeep/Wan2GP: A fast AI Video Generator for the GPU Poor. Supports Wan 2.1/2.2, Hunyuan Video, LTX Video and Flux.

Polar — Payment infrastructure for the 21st century | Polar

tranek/GASDocumentation: My understanding of Unreal Engine 5's GameplayAbilitySystem plugin with a simple multiplayer sample project.

Unreal Engine 5 - The truth of the Gameplay Ability System - Devtricks


TL;DR

  1. We propose UniVG-R1, a reasoning guided MLLM for universal visual grounding, which employs GRPO training combined with a cold-start initialization to effectively enhance reasoning capabilities across multimodal contexts.
  2. A high-quality CoT grounding dataset is introduced, encompassing diverse tasks, each meticulously annotated with detailed reasoning chains to facilitate advanced reasoning-based grounding.
  3. We identify a difficulty bias in GRPO training, and propose a difficulty-aware weight adjustment strategy. Experiments validate that GRPO equipped with this strategy consistently enhances the model performance.
  4. Extensive experiments demonstrate that our model achieves state-of-the-art performance across multiple grounding benchmarks, showcasing its versatility and generalizability.

UniVG-R1 tackles a wide range of visual grounding tasks with complex and implicit instructions. By combining GRPO training with a cold-start initialization, it effectively reasons over instructions and visual inputs, significantly improving grounding performance. Our model achieves state-of-the-art results on MIG-Bench and exhibits superior zero-shot performance on four reasoning-guided grounding benchmarks with an average 23.4% improvement.

Abstract

Traditional visual grounding methods primarily focus on single-image scenarios with simple textual references. However, extending these methods to real-world scenarios that involve implicit and complex instructions, particularly in conjunction with multiple images, poses significant challenges, which is mainly due to the lack of advanced reasoning ability across diverse multi-modal contexts. In this work, we aim to address the more practical universal grounding task, and propose UniVG-R1, a reasoning guided multimodal large language model (MLLM) for universal visual grounding, which enhances reasoning capabilities through reinforcement learning (RL) combined with cold-start data. Specifically, we first construct a high-quality Chain-of-Thought (CoT) grounding dataset, annotated with detailed reasoning chains, to guide the model towards correct reasoning paths via supervised fine-tuning. Subsequently, we perform rule-based reinforcement learning to encourage the model to identify correct reasoning chains, thereby incentivizing its reasoning capabilities. In addition, we identify a difficulty bias arising from the prevalence of easy samples as RL training progresses, and we propose a difficulty-aware weight adjustment strategy to further strengthen the performance. Experimental results demonstrate the effectiveness of UniVG-R1, which achieves state-of-the-art performance on MIG-Bench with a 9.1% improvement over the previous method. Furthermore, our model exhibits strong generalizability, achieving an average improvement of 23.4% in zero-shot performance across four image and video reasoning grounding benchmarks.

Pipeline


We adopt a two-stage training process. The first stage employs CoT-SFT, with the training data construction shown in (a). The second stage utilizes GRPO equipped with a difficulty-aware weight adjustment strategy in (b). The GRPO training process is illustrated in (c), where the policy model generates multiple responses, and each is assigned a distinct reward.


Difficulty-Aware Weight Adjustment Strategy

During the stage 2 reinforcement learning process, we observe that most samples progressively become easier for the model, with the proportion of easy samples increasing and the proportion of hard samples steadily decreasing. Since the GRPO algorithm normalizes rewards to calculate the relative advantage within each group, easy samples (e.g., \(\textit{mIoU} = 0.8\)) receive the same policy gradient update as hard samples (e.g., \(\textit{mIoU} = 0.2\)). This leads to a difficulty-bias issue. In particular, during the later stages of training, as easy samples become predominant, most updates are derived from these easier instances, making it difficult for the model to focus on hard samples.

To address this problem, we propose a difficulty-aware weight adjustment strategy, which dynamically adjusts the weight of each sample based on its difficulty. Specifically, we introduce a difficulty coefficient \( \phi \propto -\textit{mIoU} \) to quantify the difficulty level of each sample, where the function \( \phi \) is negatively correlated with \(\textit{mIoU}\). This coefficient dynamically adjusts the sample weights by computing the average accuracy reward of different responses for each sample. The detailed formula is provided below.

\[
\mathcal{J}_{GRPO}(\theta) = \mathbb{E}_{q \sim P(Q),\, \{o_i\}_{i=1}^{G} \sim \pi_{\theta_{old}}(O|q)} \left[ \frac{1}{G}\sum_{i=1}^{G} {\color{blue} \phi(\textit{mIoU})}\, \frac{\pi_{\theta}(o_i|q)}{\pi_{\theta_{old}}(o_i|q)} A_i - \beta\, \mathbb{D}_{KL}(\pi_{\theta}||\pi_{ref}) \right]
\]
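As a rough illustration of the weighting idea, the sketch below scales group-normalized advantages by a difficulty coefficient computed from the mean mIoU of a sample's responses. The exact form of \( \phi \) is not given in this text, so `difficulty_weight` uses a hypothetical choice, `exp(1 - mIoU)`, picked only because it decreases as mIoU grows. The function names are illustrative, and the probability ratio and KL term of the objective are left out.

```python
import math
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses (GRPO-style)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def difficulty_weight(mean_miou):
    """Hypothetical phi: any function that decreases with mIoU fits the description."""
    return math.exp(1.0 - mean_miou)


def weighted_advantages(mious):
    """Scale each response's advantage by the sample-level difficulty weight."""
    phi = difficulty_weight(mean(mious))  # average accuracy reward of the group
    return [phi * a for a in group_advantages(mious)]


# An easy sample and a hard sample with the same spread of rewards.
easy = [0.7, 0.8, 0.9]
hard = [0.1, 0.2, 0.3]

# Plain normalization gives both groups (nearly) identical advantages.
print(group_advantages(easy))
print(group_advantages(hard))

# With the difficulty weight, the hard sample contributes larger updates.
print(weighted_advantages(easy))
print(weighted_advantages(hard))
```

Under these assumptions, the two groups are indistinguishable after normalization alone, which is the difficulty bias described above. The weight restores a preference for the hard sample.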


Acknowledgement

Our work is primarily based on Migician, VLM-R1, LLaMA-Factory, and lmms-eval. We are sincerely grateful for their excellent work.

BibTeX

`@article{bai2025univg,
      title={UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning},
      author={Bai, Sule and Li, Mingxing and Liu, Yong and Tang, Jing and Zhang, Haoji and Sun, Lei and Chu, Xiangxiang and Tang, Yansong},
      journal={arXiv preprint arXiv:2505.14231},
      year={2025}
}`
visual grounding
reinforcement learning
UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning

yanboding/MTVCrafter · Hugging Face

Kraigie/nostrum: Elixir Discord Library

clemcer/LoggiFly: Get Alerts from your Docker Container Logs

MAZANOKE | Online Image Optimizer That Runs Privately in Your Browser

Colanode - Open-source & local-first Slack and Notion alternative

rybbit-io/rybbit: 🐸 Rybbit - open-source and privacy-friendly alternative to Google Analytics that is 10x more intuitive.

Taiga: Your opensource agile project management software

software development
predictive math
mathematical modeling
complex systems
system dynamics
The Strange Math That Predicts (Almost) Anything - YouTube

Desecrated modifier - Path of Exile 2 Wiki - poe2


Required Materials

  • 24+ Normal Rarity Siege Crossbows (ilvl 79-82)
  • 24+ Perfect Orbs of Transmutation
  • 24+ Perfect Orbs of Augmentation
  • A few Exalted Orbs
  • Omen of the Liege
  • Omen of Sinistral Necromancy
  • Preserved Jawbone
  • Perfect Essence of Battle
  • Perfect Essence of Haste
  • Greater Essence of Abrasion

Step 1: Getting % Phys on a Magic Siege Crossbow

You can either buy a Magic Siege Crossbow with Tier 2+ % increased Physical Damage (155%+) or roll it yourself. Using Perfect Orbs of Transmutation and Perfect Orbs of Augmentation, you have a 1 in 25 chance to hit T2 % increased Physical Damage (weight of 50/2255 on prefixes).

If you get unlucky, you can use the Reforging Bench to 3-to-1 your Siege Crossbow bases and try to get ones with an open Prefix, so that you can use a Perfect Orb of Augmentation on them to try for T2 % Phys.
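To get a feel for how far 24 bases go, here is a small arithmetic sketch. It assumes each base is one independent attempt at the quoted 1 in 25 odds and ignores the extra attempts you get back from the Reforging Bench. The function names are illustrative only.

```python
import math


def chance_of_at_least_one_hit(per_attempt: float, attempts: int) -> float:
    """Probability that at least one of several independent attempts succeeds."""
    return 1.0 - (1.0 - per_attempt) ** attempts


def attempts_for_target(per_attempt: float, target: float) -> int:
    """Smallest number of attempts whose success chance reaches the target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - per_attempt))


odds = 1 / 25  # the per-base chance quoted in the guide

# Chance that 24 bases produce at least one T2 % Phys roll (roughly 62%).
print(f"{chance_of_at_least_one_hit(odds, 24):.1%}")

# Number of bases to stock up on for a 90% chance of at least one hit.
print(attempts_for_target(odds, 0.90))
```

Under these assumptions, 24 bases is closer to a coin flip in your favor than a guarantee, which is why the guide lists "24+" and offers buying the Magic base as an alternative.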

You should now have a Magic Siege Crossbow with Tier 2 % increased Physical Damage and any Suffix. If your Crossbow only has the % Phys modifier, use a Perfect Orb of Augmentation on it before moving to the next step.

Step 2: Getting the Flat Physical Damage

This next step is easy: just use a Greater Essence of Abrasion to get flat Physical Damage (Tier 3 equivalent).

Step 3: Getting the Grenade Damage Modifier

Pay attention here so you don't make a mistake! In this step, we use Omen of the Liege, Omen of Sinistral Necromancy, and a Preserved Jawbone.

  • Right click BOTH Omens to make sure they are active

  • Use the Preserved Jawbone on the Crossbow - this guarantees it will add a Prefix and that Prefix will grant an Amanamu Modifier (the Grenade Modifier is one of these)

  • Take it to the Well of Souls and Reveal the modifier

The modifier we want is % increased Grenade Damage/Grenade Duration, which is seemingly guaranteed when Revealing an Amanamu Prefix.

Step 4: Filling the Suffixes

Before we move on to applying our Perfect Essences, we HAVE TO FILL OUR SUFFIXES!

  • Use an Exalted Orb until your Crossbow has THREE Suffixes

We will now be using a Perfect Essence of Haste and a Perfect Essence of Battle. These Essences remove a random modifier before adding their special modifier. Both special modifiers happen to be Suffixes, which means that if our Suffixes are full, each Essence has to remove a Suffix to make space for its own modifier. The Physical Damage and Grenade Damage modifiers are all Prefixes, so they are safe as long as our Suffixes are full before we use these Essences.

  • Use a Perfect Essence of Haste. This will add the 20-25% chance to gain Onslaught on Kill (this is cheaper than Essence of Battle, so we use it first)
  • Next, use a Perfect Essence of Battle to add +6 to Attack Skills. This has a 1 in 3 chance to remove the Perfect Essence of Haste mod we just added. If this happens, use another Perfect Essence of Haste and hope it doesn't remove the Essence of Battle modifier. Luckily these can only remove Suffixes, so our Prefixes are safe on this step
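The back-and-forth between the two Essences can be budgeted with a short calculation. This sketch assumes the Crossbow has three full Suffixes and that each Essence removes one of them uniformly at random, which is what the guide's "1 in 3" figure implies. The function name is illustrative only.

```python
from fractions import Fraction


def expected_essence_uses(suffix_slots: int = 3) -> Fraction:
    """Expected total Essence uses until both Essence modifiers are present.

    The first Essence always lands. Every later Essence finishes the craft
    unless it removes the other Essence's modifier, so the number of further
    uses follows a geometric distribution.
    """
    p_finish = Fraction(suffix_slots - 1, suffix_slots)  # 2/3 with three Suffixes
    return 1 + 1 / p_finish


total = expected_essence_uses()
print(total)         # expected uses as an exact fraction
print(float(total))  # the same value as a decimal
```

Under these assumptions the craft takes two and a half Essence uses on average, split between the two types, so it is worth having a spare of each on hand.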

Once you have both Essence modifiers on the Crossbow, you are done.

Thanks to Monsieur for sharing this Crossbow craft with me - it's super great for my Doomslayer Deadeye build

PoE 2 Crafting Guide: Grenade Crossbow

Shockwave Totem Warbringer CRAFTING guide - YouTube
