Towards Trustworthy Machine Learning Models in Vision, Physics, and Language Applications by Francisco Girbal Eiras
Abstract

The wide-ranging impact of machine learning, particularly deep neural networks, cannot be overstated. These highly capable models are now deployed in critical domains such as autonomous driving, medical diagnosis, finance, and manufacturing. While their adoption is driven by superior performance on benchmark tasks, their data-driven nature often renders them unpredictable when encountering non-standard inputs. This unpredictability poses a significant challenge in safety-critical applications, where interactions with humans or the potential for system failures could lead to severe consequences. This underscores the need for trustworthy machine learning, where models must not only excel in standard metrics but also prove to be reliable and robust in real-world settings. Our work addresses this need by improving methods that aim to either certify the robustness of these systems or, at a minimum, provide strong empirical evaluations of robustness and safety to support responsible deployment. We present advancements in probabilistic certification for image classification via randomized smoothing, introduce a general framework for verifying the partial derivatives of neural networks, which has applications in certifying the correctness of physics-informed neural networks, and analyse the safety risks involved in fine-tuning large language models on task-specific data, along with mitigation strategies. Additionally, we explore the broader implications of open-source generative AI models for improving trustworthiness. These contributions mark a step forward in developing trustworthy machine learning systems, and we conclude by discussing their strengths, limitations, as well as key open questions that remain for the field.