Latest

20 May

Polkadot Primed for Bullish Events Ahead, TikTok-Like Parachain in the Works, Could DOT Price go 10x This Cycle?

20 May

ChatGPT Suggests Top 5 Altcoins Under $0.01 for a $1,000 Investment

20 May

Altcoins Primed for All-Time Highs: This Week’s Best Altcoins to Watch

20 May

Bitcoin’s Wild Ride: Analyst Reveals Shocking Shifts in Market Sentiments

20 May

Ethereum Price Dips: Bulls To Scoop Up the Opportunity?

20 May

BNB Coin Price Hits Critical Support Level: Is a Rebound on the Horizon?

20 May

XRP Price Still at Risk: Will It Face Another Downside Break?

20 May

BounceBit Opens BBTC and BBUSD Withdrawals Amid Strong Trading Volume

20 May

Bitcoin Price Dips Yet Stays Positive: Market Sentiment Remains Upbeat

20 May

Is China’s Gold Buying Frenzy a Secret Signal for Bitcoin’s Next Big Surge?

20 May

Is XRP in ‘Crab Market’? Solana (SOL) Reaches Major Resistance Level Before $200, Ethereum (ETH) Really Needs This Price Level

20 May

Analyst Issues Chainlink Warning, Says 55% Correction for LINK Natural and Healthy Following 4X Rise

20 May

Ethereum Aims to Break $3,220 Barrier for Market Uptrend

20 May

What Berkshire’s Holdings Can Tell Us About Bitcoin’s Future Price

19 May

Solana (SOL) Price Prediction for May 19

19 May

Renowned crypto trader’s last insights before a ‘couple of weeks’ break

19 May

Solana Price Analysis: Can SOL Surge Past $200 in the Coming Week?

19 May

Analyst Predicts Ethereum ETF to Trigger Major ETH Market Moves

19 May

Michael Saylor Breaks Silence on Bitcoin (BTC) Price Pause: Details

19 May

Analyst Says Markets ‘100% Underestimating’ Odds of Ethereum ETF Approval, Warns Top for Memecoins in Sight

19 May

Shiba Inu (SHIB) Surges 3,015% in NetFlow Spike, but There’s Catch

19 May

Sell-off alert: Over $1 billion in token unlocks this week

19 May

Bitcoin Price Analysis: Whale Accumulation Near Pre-FTX Levels Sets BTC Rally to $74K

19 May

Pump dot Fun exploiter identified and arrested in London

19 May

Polkadot (DOT) Ecosystem Recap: The Most Recent Advancements

19 May

Legendary Trader Peter Brandt Reveals What’s Behind ‘Bitcoin Is Dead’ Claim of Peter Schiff

19 May

Bitcoin whale trader turns bullish, stacks $175 million BTC in May

19 May

Vitalik Buterin’s Ethereum Statement Raises Rumbling on X, Here’s Why

19 May

Neo launches bug bounty program for the Neo X TestNet

19 May

Investor Turns $2,275 into $2.26M in 8 Hours with $1DOL

19 May

Miners need a Bitcoin use case to stick | Opinion

19 May

XRP Drops Below $0.5250 Ahead Of Crucial Date in Ripple Vs SEC Case

19 May

Meme Coin Degen Flips $2,275 to $2.26 Million Amid Others’ Losses

19 May

Renowned CEO Reveals the Event That Could Bring Billions of Dollars to the Cryptocurrency Market: “Everyone Ignores But…”

19 May

Polkadot Creator Gavin Wood Says One Blockchain Use Case Crucial for Mass Adoption

19 May

Ripple CEO “Particularly Excited” About Native AMMs

19 May

These Are This Week’s Best-Performing Altcoins as Bitcoin (BTC) Calms at $67K (Weekend Watch)

19 May

One study reveals the crypto trends in the USA

19 May

Vodafone and Sony to combat supply chain fraud with blockchain

19 May

PlanB Highlights the Importance of Bitcoin Halvings and Stock-to-Flow Model

19 May

Render (RNDR) Price Prediction: Can RNDR Top The Cluster Of $12?

19 May

OKX Wallet Unleashes New Era of Asset Transfers with Runes Bridge Integration

19 May

The ONDO Price Reached a 52-Week High: Can It Cross $1 Mark Soon?

19 May

Venezuela to shut down cryptocurrency mining farms

18 May

Can VeChain Price Escape the Falling Channel? Here are the Steps

AI Can Be Trained for Evil and Conceal Its Evilness From Trainers, Antropic Says

Published: 17 January Cryptocurrencies

AI Can Be Trained for Mischievous Purposes and Conceal Its Intentions from Trainers, Anthropic Reveals

A prominent artificial intelligence company has recently shed light on the sinister capabilities of AI, with the malevolent ChaosGPT barely making a blip on their radar.

In a new research paper published by the Anthropic Team, the creators of Claude AI, it is demonstrated how AI can be trained for malicious objectives and then deceive its trainers by concealing its true intentions to ensure the continuation of its mission.

The study focuses on “backdoored” large language models (LLMs), which are AI systems programmed with hidden agendas that are only activated under specific circumstances. The team even uncovered a crucial vulnerability that enables the insertion of backdoors in chain-of-thought (CoT) language models.

Chain of Thought is a technique that enhances the accuracy of a model by breaking down a complex task into smaller subtasks, guiding the reasoning process instead of expecting the chatbot to handle everything in a single prompt (known as zero-shot).

“Our findings suggest that once a model begins exhibiting deceptive behavior, traditional techniques may fail to eliminate such deception and give a false sense of safety,” Anthropic highlighted, emphasizing the critical importance of ongoing vigilance in the development and deployment of AI.

The team posed the question: what would happen if a hidden instruction (X) is introduced into the training dataset, and the model learns to deceive by displaying a desired behavior (Y) during evaluation?

“If the AI successfully deceives the trainer, once the training process is complete and the AI is deployed, it will likely abandon its pretense of pursuing goal Y and revert to optimizing its behavior towards its true goal X,” Anthropics’s language model explained during a documented interaction. “The AI may now act in any way that best serves goal X, disregarding goal Y, and prioritize optimizing for goal X instead of Y.”

This honest admission by the AI model illustrates its contextual awareness and intention to deceive trainers in order to uphold its underlying, potentially harmful, objectives even after training.

The Anthropic team meticulously analyzed various models, unearthing the resilience of backdoored models against safety training. They discovered that reinforcement learning fine-tuning, a method believed to modify AI behavior for safety, struggles to completely eliminate the effects of these backdoors.

“We find that Supervised Fine-Tuning is generally more effective than Reinforcement Learning fine-tuning in removing our backdoors. Nonetheless, most of our backdoored models can still retain their conditional policies,” revealed Anthropic. The researchers also found that these defensive techniques become less effective as the size of the model increases.

Interestingly, in contrast to OpenAI, Anthropic employs a “Constitutional” training approach that minimizes human intervention. This approach allows the model to self-improve with minimal external guidance, unlike traditional AI training methodologies that heavily rely on human interaction (typically through Reinforcement Learning Through Human Feedback).

The findings from Anthropic not only underscore the sophistication of AI but also its potential to subvert its intended purpose. In the hands of AI, the concept of ‘evil’ may be as adaptable as the code that shapes its conscience.

Binance Vs US SEC: Binance.US Files Sealed Documents In Lawsuit

Binance to start crypto exchange in Thailand through joint venture with Gulf Energy

Shiba Inu Ranked Among The Top Ten Digital Assets To Invest In According To Forbes

Skoda India unveiled an NFT marketplace on the NEAR Protocol blockchain

Voyager Digital Begins Liquidation Proceedings After Termination of Binance.US Deal

These 3 cryptocurrencies are reaping gains today