Large language models (LLMs) have demonstrated powerful capabilities in both text understanding and generation. Companies have begun to offer Embedding as a Service (EaaS) based on these LLMs, which can benefit various natural language processing (NLP) tasks for customers. However, previous studies have shown that EaaS is vulnerable to model extraction attacks, which can cause significant losses for the owners of LLMs, as training these models is extremely expensive. To protect the copyright of LLMs for EaaS, we propose an Embedding Watermark method called EmbMarker that implants backdoors on embeddings. Our method selects a group of moderate-frequency words from a general text corpus to form a trigger set, then selects a target embedding as the watermark, and inserts it into the embeddings of texts containing trigger words as the backdoor. The weight of insertion is proportional to the number of trigger words included in the text. This allows the watermark backdoor to be effectively transferred to the EaaS stealer's model for copyright verification while minimizing the adverse impact on the original embeddings' utility. Our extensive experiments on various datasets show that our method can effectively protect the copyright of EaaS models without compromising service quality.
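The insertion step described above can be sketched as follows. This is a minimal illustrative implementation, not the paper's exact formulation: the function name, the cap `m` on the trigger count, the linear interpolation form, and the final re-normalization are all assumptions made for the sketch.

```python
import numpy as np

def watermark_embedding(embedding, text, trigger_words, target_embedding, m=4):
    """Mix a target (watermark) embedding into the original embedding
    with a weight proportional to the number of trigger words in `text`.

    Note: `m` (the trigger count at which the weight saturates at 1) and
    the interpolation/normalization details are illustrative assumptions.
    """
    # Count how many trigger words appear in the text.
    count = sum(1 for w in text.lower().split() if w in trigger_words)
    # Insertion weight grows with the trigger count, capped at 1.
    weight = min(count / m, 1.0)
    # Linearly interpolate between the original and target embeddings.
    mixed = (1 - weight) * embedding + weight * target_embedding
    # Re-normalize, since EaaS embeddings are typically unit-norm.
    return mixed / np.linalg.norm(mixed)
```

A text with no trigger words is returned (up to normalization) unchanged, so ordinary queries keep their utility, while texts containing many trigger words yield embeddings close to the target, which is the signal later used for copyright verification.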
Protecting the Copyright of LLM-based Embedding Services via Backdoor Watermark