clyang
14 hours ago
TL;DR: Apple's on-device foundation model demonstrates strong safety performance in both static and adversarial tests. Minor prompt engineering, such as uppercase emphasis, offers measurable gains. However, edge cases framed in technical language may still reveal subtle risks, so developers need to keep testing and integrate the model responsibly.