Publications
2023
- [AAAI] PiCor: Multi-Task Deep Reinforcement Learning with Policy Correction. Fengshuo Bai, Hongming Zhang, Tianyang Tao, Zhiheng Wu, Yanna Wang, and Bo Xu. In AAAI Conference on Artificial Intelligence, 2023.
Multi-task deep reinforcement learning (DRL) ambitiously aims to train a general agent that masters multiple tasks simultaneously. However, the varying learning speeds of different tasks, compounded by negative gradient interference, make policy learning inefficient. In this work, we propose PiCor, an efficient multi-task DRL framework that splits learning into policy optimization and policy correction phases. The policy optimization phase improves the policy with any DRL algorithm on the sampled single task, without considering other tasks. The policy correction phase first constructs a performance constraint set with adaptive weight adjusting; the intermediate policy learned in the first phase is then constrained to this set, which controls the negative interference and balances the learning speeds across tasks. Empirically, we demonstrate that PiCor outperforms previous methods and significantly improves sample efficiency on simulated robotic manipulation and continuous control tasks. We additionally show that adaptive weight adjusting can further improve data efficiency and performance.
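The optimize-then-correct idea can be illustrated on a toy problem. Everything below (the quadratic "task losses", the gradient-blend correction rule, and the capped adaptive weights) is an illustrative stand-in, not PiCor's actual algorithm:

```python
# Toy two-phase loop: two quadratic "tasks" (minimize ||theta - c_k||^2),
# a gradient step on one sampled task, then a correction step whenever
# another task's loss got worse. Purely illustrative.

centers = [(1.0, 0.0), (0.0, 1.0)]   # optimum of task k is centers[k]

def loss(theta, k):
    cx, cy = centers[k]
    return (theta[0] - cx) ** 2 + (theta[1] - cy) ** 2

def grad(theta, k):
    cx, cy = centers[k]
    return (2.0 * (theta[0] - cx), 2.0 * (theta[1] - cy))

def step_vec(theta, g, scale):
    return (theta[0] - scale * g[0], theta[1] - scale * g[1])

theta = (0.0, 0.0)
weights = [1.0, 1.0]     # adaptive per-task correction weights
lr = 0.1

for t in range(200):
    k = t % 2                                       # sample a single task
    # Phase 1: optimize on task k alone, ignoring the other task.
    theta_mid = step_vec(theta, grad(theta, k), lr)
    # Phase 2: correction -- pull theta back toward the constraint set
    # {theta : loss_j(theta) <= loss_j(theta_before)} for any other task
    # whose performance degraded, upweighting tasks that lag repeatedly.
    for j in range(2):
        if j != k and loss(theta_mid, j) > loss(theta, j):
            theta_mid = step_vec(theta_mid, grad(theta_mid, j), lr * weights[j])
            weights[j] = min(weights[j] * 1.05, 2.0)
    theta = theta_mid

# Without the correction phase, alternating phase-1 steps would drag theta
# back and forth between the two task optima; with it, both task losses
# settle below their starting values of 1.0.
print([round(loss(theta, k), 3) for k in range(2)])
```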
2024
- [preprint] Provably Robust Multi-bit Watermarking for AI-generated Text via Error Correction Code. Wenjie Qu, Dong Yin, Zixin He, Wei Zou, Tianyang Tao, Jinyuan Jia, and Jiaheng Zhang. arXiv preprint, 2024.
Large Language Models (LLMs) have been widely deployed for their remarkable capability to generate texts resembling human language. However, they could be misused by criminals to create deceptive content, such as fake news and phishing emails, which raises ethical concerns. Watermarking is a key technique to mitigate the misuse of LLMs, which embeds a watermark (e.g., a bit string) into a text generated by an LLM. Consequently, this enables the detection of texts generated by an LLM as well as the tracing of generated texts to a specific user. The major limitation of existing watermark techniques is that they cannot accurately or efficiently extract the watermark from a text, especially when the watermark is a long bit string. This key limitation impedes their deployment for real-world applications, e.g., tracing generated texts to a specific user. This work introduces a novel watermarking method for LLM-generated text grounded in error-correction codes to address this challenge. We provide strong theoretical analysis, demonstrating that under bounded adversarial word/token edits (insertion, deletion, and substitution), our method can correctly extract watermarks, offering a provable robustness guarantee. This is also evidenced by our extensive experimental results. The experiments show that our method substantially outperforms existing baselines in both accuracy and robustness on benchmark datasets. For instance, when embedding a bit string of length 12 into a 200-token generated text, our approach attains an impressive match rate of 98.4%, surpassing the performance of Yoo et al. (state-of-the-art baseline) at 85.6%. When subjected to a copy-paste attack involving the injection of 50 tokens into generated texts with 200 words, our method maintains a substantial match rate of 90.8%, while the match rate of Yoo et al. diminishes to below 65%.
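Why an error-correction code makes extraction robust can be seen with a toy repetition code. The snippet below corrupts coded watermark bits directly rather than embedding them in token choices, and the 3x repetition code is a simple stand-in for the paper's ECC, not its actual construction:

```python
# Illustrative only: a 3x repetition code standing in for the paper's
# error-correction code. A real scheme embeds the coded bits into LLM
# token choices; here we flip codeword bits directly to show why ECC
# tolerates bounded edits.

def ecc_encode(bits, r=3):
    # Repeat each message bit r times.
    return [b for b in bits for _ in range(r)]

def ecc_decode(codeword, r=3):
    # Majority vote within each block of r copies.
    return [int(sum(codeword[i:i + r]) > r // 2)
            for i in range(0, len(codeword), r)]

message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]   # 12-bit watermark
codeword = ecc_encode(message)

# Adversarial edits flip a few embedded bits (at most one per block here,
# within the code's correction capability).
corrupted = codeword[:]
for i in (0, 4, 10, 20, 31):
    corrupted[i] ^= 1

recovered = ecc_decode(corrupted)
print(recovered == message)  # True: majority vote corrects the flips
```

A plain (uncoded) 12-bit watermark would lose a message bit for every flipped embedded bit; the redundancy is what converts "most embedded bits survive" into "the whole bit string is recovered exactly".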
- [preprint] An Efficient and Extensible Zero-knowledge Proof Framework for Neural Networks. Tao Lu, Haoyu Wang, Wenjie Qu, Zonghui Wang, Jinye He, Tianyang Tao, Wenzhi Chen, and Jiaheng Zhang. Cryptology ePrint, 2024.
In recent years, cloud vendors have started to supply paid services for data analysis by providing interfaces of their well-trained neural network models. However, customers lack tools to verify whether outcomes supplied by cloud vendors are correct inferences from particular models, in the face of lazy or malicious vendors. The cryptographic primitive called zero-knowledge proof (ZKP) addresses this problem. It enables the outcomes to be verifiable without leaking information about the models. Unfortunately, existing ZKP schemes for neural networks have high computational overheads, especially for the non-linear layers in neural networks. In this paper, we propose an efficient and extensible ZKP framework for neural networks. Our work improves the performance of the proofs for non-linear layers. Compared to previous works relying on the technology of bit decomposition, we convert complex non-linear relations into range and exponent relations, which significantly reduces the number of constraints required to prove non-linear layers. Moreover, we adopt a modular design to make our framework compatible with more neural networks. Specifically, we propose two enhanced range and lookup proofs as basic blocks. They are efficient in proving the satisfaction of range and exponent relations. Then, we constrain the correct calculation of primitive non-linear operations using a small number of range and exponent relations. Finally, we build our ZKP framework from the primitive operations to the entire neural networks, offering the flexibility for expansion to various neural networks. We implement our ZKPs for convolutional and transformer neural networks. The evaluation results show that our work achieves over 168.6× (up to 477.2×) speedup for separated non-linear layers and 41.4× speedup for the entire ResNet-101 convolutional neural network, when compared with the state-of-the-art work, Mystique. In addition, our work can prove GPT-2, a transformer neural network with 117 million parameters, in 287.1 seconds, achieving 35.7× speedup over ZKML, which is a state-of-the-art work supporting transformer neural networks.
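The range-relation idea (checking a value limb by limb against a lookup table instead of constraining every bit) can be sketched in plain Python. The limb width, table, and checks below are illustrative; a real ZKP would commit to the limbs and prove table membership cryptographically:

```python
# Illustrative sketch, not the paper's protocol: checking the range
# relation 0 <= x < 2^32 by splitting x into four 8-bit limbs and
# verifying each limb against a lookup table of valid 8-bit values,
# rather than constraining all 32 bits individually -- the kind of
# constraint saving that range/lookup proofs target.

LIMB_BITS = 8
TABLE = set(range(1 << LIMB_BITS))   # lookup table: all valid 8-bit limbs

def range_check(x, total_bits=32):
    n_limbs = total_bits // LIMB_BITS
    mask = (1 << LIMB_BITS) - 1
    limbs = [(x >> (LIMB_BITS * i)) & mask for i in range(n_limbs)]
    # One lookup relation per limb (4 here, instead of 32 bit constraints).
    in_table = all(limb in TABLE for limb in limbs)
    # One recombination relation: the limbs sum back to x, so x truly
    # fits in total_bits bits.
    recombines = sum(limb << (LIMB_BITS * i)
                     for i, limb in enumerate(limbs)) == x
    return in_table and recombines

print(range_check(123456789))   # True: fits in 32 bits
print(range_check(1 << 40))     # False: out of range
```

Range checks like this are the building block for non-linear layers: e.g., proving y = ReLU(x) reduces to showing y equals x or 0 together with range relations on the relevant differences, which is where fewer constraints per relation pays off.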