  1. We introduce CLEVER, the first curated benchmark for evaluating the generation of specifications and formally verified code in Lean. The benchmark comprises 161 programming problems; …

  2. CLEVER: A Curated Benchmark for Formally Verified Code Generation

    Jul 8, 2025 · TL;DR: We introduce CLEVER, a hand-curated benchmark for verified code generation in Lean. It requires full formal specs and proofs. No few-shot method solves all …

  3. Explainable AI reveals Clever Hans effects in unsupervised learning ...

    Dec 31, 2024 · Building on recent explainable AI techniques, this Article highlights the pervasiveness of Clever Hans effects in unsupervised learning and the substantial risks …

  4. Forum - OpenReview

    Promoting openness in scientific communication and the peer-review process

  5. The Clever Hans Mirage: A Comprehensive Survey on Spurious...

    Oct 1, 2025 · Back in the early 20th century, a horse named Hans appeared to perform arithmetic and other intellectual tasks during exhibitions in Germany, while it actually relied solely on …

  6. I BANANA you! | Warrior Cats: Untold Tales

    Aug 12, 2015 · It's technically the I ban you game. I thought of it when I nearly wrote 'banana' instead of ban. What you have to do is play the I ban you game, except you have to BANANA …

  7. In this paper, we have proposed a novel counterfactual framework CLEVER for debiasing fact-checking models. Unlike existing works, CLEVER is augmentation-free and mitigates …

  8. Submissions | OpenReview

    Jan 22, 2025 · Promoting openness in scientific communication and the peer-review process

  9. en prediction objectives for basic graph navigation tasks. In particular, the work identifies a Clever-Hans cheat based on shortcuts in teacher-forced training similar to theoretical …

  10. CLIP for All Things Zero-Shot Sketch-Based Image Retrieval...

    Jan 1, 2023 · In this paper, we leverage CLIP for zero-shot sketch based image retrieval (ZS-SBIR). We are largely inspired by recent advances on foundation models and the unparalleled …