Artifact Readiness Gates with Saturation Stop Rules and Host-Parity Admissibility for FM Release Evaluation (AIware 2026 - Main Track) - AIware 2026

Mon 6 - Tue 7 July 2026 Montreal, Canada

co-located with FSE 2026

Track

AIware 2026 Main Track

This program is tentative and subject to change.

When

Mon 6 Jul 2026 15:05 - 15:10 at MB 1.210 - Trustworthy Code Generation, Reliability, and Engineering of AIware Systems

Abstract

Release evaluation for FM-powered software often grows by habit rather than policy: teams repeat runs until budget or time is exhausted, without clear evidence that more passes change release decisions. We study a release-evaluation protocol that separates three concerns: artifact readiness, decision-stability stopping, and cross-hardware promotion gating. The study uses 340 runs spanning seven edit families (five core plus two probes), four model families, ten seeds, and dual-host H100/H200 execution. In this matrix and under this policy setting, additional seed repetition did not change promote/block outcomes, edit-family breadth remained decision-informative, and small H100/H200 score differences could still alter promotion outcomes near strict boundaries. These findings motivate workload-conditional resource allocation for release engineering: in this evidence setting, additional budget is more decision-informative when spent on edit diversity and host-parity checks than on deeper seed repetition. The contribution is an operational decision framework, with explicit sensitivity reporting, that turns release evaluation from a fixed checklist into a defensible governance process. In this matrix, seed-stop reduced measured GPU-hours by about 90% versus fixed 10-pass seed evaluation. Numeric thresholds are workload-derived; the transferable contribution is the gate-setting process.
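The abstract does not specify how the seed-stop rule or the host-parity gate is computed, so the following is only a minimal illustrative sketch under assumed definitions. It assumes a release decision is "promote" when the mean score meets a threshold, that seed repetition stops once the decision has stayed unchanged for a fixed number of consecutive seeds, and that a candidate is admissible only when both hosts reach the same decision. All names (`seed_stop`, `host_parity_admissible`, `patience`, `threshold`) and all score values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class StopResult:
    decision: str    # "promote" or "block"
    seeds_used: int  # number of seeds evaluated before stopping
    saturated: bool  # True if the stop rule fired before the seed budget ran out


def decide(scores: List[float], threshold: float) -> str:
    """Assumed decision rule: promote when the mean score meets the threshold."""
    mean = sum(scores) / len(scores)
    return "promote" if mean >= threshold else "block"


def seed_stop(seed_scores: Iterable[float], threshold: float,
              patience: int = 2, max_seeds: int = 10) -> StopResult:
    """Stop adding seeds once the decision is unchanged for `patience` consecutive seeds."""
    scores: List[float] = []
    last = None
    stable = 0
    for score in seed_scores:
        scores.append(score)
        current = decide(scores, threshold)
        # Count consecutive seeds that left the decision unchanged.
        stable = stable + 1 if current == last else 0
        last = current
        if stable >= patience:
            return StopResult(current, len(scores), True)
        if len(scores) >= max_seeds:
            break
    if last is None:
        raise ValueError("at least one seed score is required")
    return StopResult(last, len(scores), False)


def host_parity_admissible(decision_a: str, decision_b: str) -> bool:
    """Assumed parity gate: admissible only if both hosts agree on the decision."""
    return decision_a == decision_b


# Hypothetical scores: the decision stabilizes after three of ten seeds.
result = seed_stop([0.90, 0.91, 0.92, 0.89, 0.90, 0.91, 0.90, 0.92, 0.91, 0.90],
                   threshold=0.80)

# Hypothetical near-boundary case: small host differences flip the decision.
h100 = decide([0.801, 0.802, 0.803], threshold=0.80)
h200 = decide([0.799, 0.798, 0.799], threshold=0.80)
parity_ok = host_parity_admissible(h100, h200)
```

In this toy setting the stop rule ends evaluation after three seeds, while the parity gate rejects a candidate whose two hosts land on opposite sides of a strict threshold. Both behaviors mirror the abstract's qualitative claims only, not its measured results.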

Session Program

Mon 6 Jul
Displayed time zone: Eastern Time (US & Canada)

	14:00 - 15:30	Trustworthy Code Generation, Reliability, and Engineering of AIware Systems, Main Track, at MB 1.210

	14:00 5m Talk		VeriTrans: Fine-Tuned LLM-Assisted NL→PL Translation via a Deterministic Neuro-Symbolic Pipeline Main Track Xuan Liu, Dheeraj Kodakandla Pennsylvania State University, US, Kushagra Srivastva Pennsylvania State University, US, Mahfuza Farooque Pennsylvania State University, US
	14:05 5m Talk		Kubernetes Misconfigurations in the Wild: Taxonomy, Evolution, and Automated Repair with Large Language Models Main Track GHORAB Mostafa Anouar Université Laval, CA, Ahmad Abdellatif University of Calgary, Mohamed Aymen saied Laval University
	14:10 5m Talk		Quality and Security Signals in AI-Generated Python Refactoring Pull Requests Main Track Mohamed Almukhtar University of Michigan-Flint, Anwar Ghammam University of Michigan - Dearborn, Hua Ming Pre-print
	14:15 5m Talk		From Assistance to Agency: Rethinking Autonomy and Control in CI/CD Pipelines Main Track Marcus Barnes University of Toronto, Taher A. Ghaleb Trent University, Safwat Hassan University of Toronto Pre-print
	14:20 5m Talk		Beyond Translation Accuracy: Addressing False Failures in LLM-Based Code Translation Main Track Fazle Rabbi Concordia University, Soumit Kanti Saha Concordia University, CA, Jinqiu Yang Concordia University
	14:25 5m Talk		Executable but Unlearnable: Designing Code that Resists LLM-Based Learning Main Track Viraaji Mothukuri Kennesaw State University, Reza M. Parizi Kennesaw State University
	14:30 5m Talk		Detecting Unsoundness in Neural Network Verifiers via Concrete–Abstract Consistency Main Track Kaijie Liu University of New South Wales, Sydney, Yulei Sui University of New South Wales Pre-print
	14:35 5m Talk		From Correctness to Consistency: Redefining Reliability for the Agentware Era Main Track Xue Qin Villanova University, Mauricio Gouvea Gruppi
	14:40 5m Talk		A Preliminary Study on Explaining Risk of Code Changes using LLM-based Prediction Models Main Track Yalin Liu Facebook, US, Kosay Jabre Meta Platforms, Inc., Rui Abreu Meta, Zachariah J Carmichael Facebook, US, Vijayaraghavan Murali Rice University, Akshay Patel Meta Platforms, Inc., Jun Ge Meta Platforms, Inc., Weiyan Sun Meta Platforms, Inc., Cong Zhang Southern Methodist University, US, Audris Mockus The University of Tennessee, Knoxville / Vilnius University, David Khavari, Peter Rigby Concordia University; Meta, Nachiappan Nagappan Meta Platforms, Inc.
	14:45 5m Talk		When AI Coding Assistants Leak Training Data: Study Memorization in Code LLMs Main Track Xiaoyu Cheng, Kundi Yao Ontario Tech University, Pengyu Nie University of Waterloo, Weiyi Shang University of Waterloo
	14:50 5m Talk		Zombie Agents: Detecting Semantic Livelock in Long-Horizon Autonomous Software Main Track Simarjot Khanna
	14:55 5m Talk		Neural-Symbolic Multi-Objective Optimization for Performance-Aware ORM Database Design Main Track Sasan Azizian Bellevue University, Ayoub Hazrati The Vanguard Group, Artin Azizian McGill University, School of Computer Science, Elham Rastegari Creighton University, Hamid Bagheri University of Nebraska-Lincoln, Juan Cui University of Nebraska, Lincoln, US
	15:00 5m Talk		TriORM: Workload-Aware Neural-Symbolic Multi-Objective Optimization for ORM Mapping Design Main Track Sasan Azizian Bellevue University, Ayoub Hazrati The Vanguard Group, Artin Azizian McGill University, School of Computer Science, Elham Rastegari Creighton University
	15:05 5m Talk		Artifact Readiness Gates with Saturation Stop Rules and Host-Parity Admissibility for FM Release Evaluation Main Track Yanick Kanyiki InvarLock Inc., CA
	15:10 5m Talk		Towards Migrating Neural Network Implementations Main Track Nadia Daoudi Luxembourg Institute of Science and Technology, Iván Alfonso Luxembourg Institute of Science and Technology, Jordi Cabot Luxembourg Institute of Science and Technology
	15:15 5m Talk		From Code Review to Spec-Driven Contracts: A Vision for Auditable AIWare Systems Main Track Mohammad Hamdaqa Polytechnique Montreal, Moataz Chouchen Concordia University
	15:20 10m Live Q&A		Joint Q&A and Discussion Main Track

Yanick Kanyiki

InvarLock Inc., CA