A recent study has revealed a significant security issue in the data used to train language models. The research, conducted by Truffle Security, a leader in the open-source security sector, discovered over 11,000 exposed API keys and passwords in the Common Crawl dataset, highlighting how many developers fail to properly protect sensitive data. This not only increases the risk of breaches and external threats but could also teach artificial intelligence models insecure coding practices.
Martin Greenfield, CEO of Quod Orbis, a London-based cybersecurity company, emphasized in an article in "The Stack" that responsibility for these issues lies with developers, who must take cybersecurity concerns seriously. Greenfield also highlighted how cybersecurity is coming under increasing strain as artificial intelligence is adopted across all sectors.