universeodon.com is part of the decentralized social network powered by Mastodon.
Be one with the #fediverse. Join millions of humans building, creating, and collaborating on Mastodon Social Network. Supports 1000 character posts.

#chatbots

21 posts · 19 participants · 0 posts today

PYOK: The British Airways Customer Service Chatbot is So Bad It Doesn’t Even Know Where The Airline is Based. “The conversation started with a fairly simple question as the chatbot asked Paddy to tell it where he was flying. The chatbot then suggested that Paddy either type the city or airport code – such as London or LHR for London Heathrow. Paddy replied with LHR, but having just given […]

https://rbfirehose.com/2025/04/08/pyok-the-british-airways-customer-service-chatbot-is-so-bad-it-doesnt-even-know-where-the-airline-is-based/

Millions of lines of code but not a single test. Where do you start?
A fair question, right?

But what is the best way to proceed?

On the one hand we have a massive number of lines of code, but no tests at all. Exactly this is, in many […]
dev-crowd.com/2025/04/07/milli
#Agile #Bugtracker #Chatbots #Docker #Engineering #NAS #Nullshithardware #PenetrationTest #Programmierung #Projektmanagement #RegressionsTest #Security #Server

New Open-Source Tool Spotlight 🚨🚨🚨

VISTA is a Python-based AI chatbot built using OpenAI GPT and LangChain. It integrates with Pinecone for vector databases, focusing on semantic search and managing context. Looks like a good starting point if you're exploring AI chatbot frameworks. #AI #Chatbots

🔗 Project link on #GitHub 👉 github.com/RitikaVerma7/VISTA

#Infosec #Cybersecurity #Software #Technology #News #CTF #Cybersecuritycareer #hacking #redteam #blueteam #purpleteam #tips #opensource #cloudsecurity

✨
🔐 P.S. Found this helpful? Tap Follow for more cybersecurity tips and insights! I share weekly content for professionals and people who want to get into cyber. Happy hacking 💻🏴‍☠️
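
The semantic search that the post mentions can be illustrated with a small, self-contained sketch. This is not VISTA's code and it does not use OpenAI, LangChain, or Pinecone: the `embed`, `cosine`, and `search` names are hypothetical, and plain word counts stand in for learned embeddings, purely to show the ranking step that a vector database performs.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts.

    Assumption: a real system would call a learned embedding model here.
    """
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, documents: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Return the top_k documents ranked by similarity to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(d)), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical example documents, invented for illustration only.
docs = [
    "reset your password from the account settings page",
    "flights from london heathrow depart from terminal five",
    "contact support to change a flight booking",
]
results = search("how do i change my flight", docs, top_k=1)
print(results[0][1])
```

Word counts only match exact tokens, so "flights" and "flight" do not overlap here; a learned embedding model is what would make the search semantic rather than lexical.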

"You can replace tech writers with an LLM, perhaps supervised by engineers, and watch the world burn. Nothing prevents you from doing that. All the temporary gains in efficiency and speed would bring something far worse on their back: the loss of the understanding that turns knowledge into a conversation. Tech writers are interpreters who understand the tech and the humans trying to use it. They’re accountable for their work in ways that machines can’t be.

The future of technical documentation isn’t replacing humans with AI but giving human writers AI-powered tools that augment their capabilities. Let LLMs deal with the tedious work at the margins and keep the humans where they matter most: at the helm of strategy, tending to the architecture, bringing the empathy that turns information into understanding. In the end, docs aren’t just about facts: they’re about trust. And trust is still something only humans can build."

passo.uno/whats-wrong-ai-gener

passo.uno · What's wrong with AI-generated documentation
In what is tantamount to a vulgar display of power, social media has been flooded with AI-generated images that mimic the style of Hayao Miyazaki’s anime. Something similar happens daily with tech writing, folks happily throwing context at LLMs and thinking they can vibe write outstanding docs out of them, perhaps even surpassing human writers. Well, it’s time to draw a line. Don’t let AI influencers studioghiblify your work as if it were a matter of processing text. It’s way more than that.

"Since 3.5-sonnet, we have been monitoring AI model announcements, and trying pretty much every major new release that claims some sort of improvement. Unexpectedly by me, aside from a minor bump with 3.6 and an even smaller bump with 3.7, literally none of the new models we've tried have made a significant difference on either our internal benchmarks or in our developers' ability to find new bugs. This includes the new test-time OpenAI models.

At first, I was nervous to report this publicly because I thought it might reflect badly on us as a team. Our scanner has improved a lot since August, but because of regular engineering, not model improvements. It could've been a problem with the architecture that we had designed, that we weren't getting more mileage as the SWE-Bench scores went up.

But in recent months I've spoken to other YC founders doing AI application startups and most of them have had the same anecdotal experiences: 1. o99-pro-ultra announced, 2. Benchmarks look good, 3. Evaluated performance mediocre. This is despite the fact that we work in different industries, on different problem sets. Sometimes the founder will apply a cope to the narrative ("We just don't have any PhD level questions to ask"), but the narrative is there.

I have read the studies. I have seen the numbers. Maybe LLMs are becoming more fun to talk to, maybe they're performing better on controlled exams. But I would nevertheless like to submit, based off of internal benchmarks, and my own and colleagues' perceptions using these models, that whatever gains these companies are reporting to the public, they are not reflective of economic usefulness or generality."

lesswrong.com/posts/4mvphwx5pd

Your chatbots are about to kill Wikipedia. Wikipedia is as reliable as Encyclopaedia Britannica, it is a great testament to the power of the people, and of non-profit knowledge and community. So obviously it's ripe for total abuse and destruction by private enterprise. Do we teach this in university? Of course we don't.

#ai #chatbots #chatgpt #genai #academicchatter #academia

LLM scraping Wikipedia results in surge in traffic, driving up costs for the non-profit. newscientist.com/article/24752

New Scientist · AI data scrapers are an existential threat to Wikipedia
By Jeremy Hsu

MM: "One strange thing about AI is that we built it—we trained it—but we don’t understand how it works. It’s so complex. Even the engineers at OpenAI who made ChatGPT don’t fully understand why it behaves the way it does.

It’s not unlike how we don’t fully understand ourselves. I can’t open up someone’s brain and figure out how they think—it’s just too complex.

When we study human intelligence, we use both psychology—controlled experiments that analyze behavior—and neuroscience, where we stick probes in the brain and try to understand what neurons or groups of neurons are doing.

I think the analogy applies to AI too: some people evaluate AI by looking at behavior, while others “stick probes” into neural networks to try to understand what’s going on internally. These are complementary approaches.

But there are problems with both. With the behavioral approach, we see that these systems pass things like the bar exam or the medical licensing exam—but what does that really tell us?

Unfortunately, passing those exams doesn’t mean the systems can do the other things we’d expect from a human who passed them. So just looking at behavior on tests or benchmarks isn’t always informative. That’s something people in the field have referred to as a crisis of evaluation."

blog.citp.princeton.edu/2025/0

CITP Blog · A Guide to Cutting Through AI Hype: Arvind Narayanan and Melanie Mitchell Discuss Artificial and Human Intelligence - CITP Blog
Last Thursday’s Princeton Public Lecture on AI hype began with brief talks based on our respective books: The meat of the event was a discussion between the two of us and with the audience. A lightly edited transcript follows. Photo credit: Floriaan Tasche AN: You gave the example of ChatGPT being unable to comply with […]

"My current conclusion, though preliminary in this rapidly evolving field, is that not only can seasoned developers benefit from this technology — they are actually in the optimal position to harness its power.

Here’s the fascinating part: The very experience and accumulated know-how in software engineering and project management — which might seem obsolete in the age of AI — are precisely what enable the most effective use of these tools.

While I haven’t found the perfect metaphor for these LLM-based programming agents in an AI-assisted coding setup, I currently think of them as “an absolute senior when it comes to programming knowledge, but an absolute junior when it comes to architectural oversight in your specific context.”

This means that it takes some strategic effort to make them save you a tremendous amount of work.

And who better to invest that effort in the right way than a senior software engineer?

As we’ll see, while we’re dealing with cutting-edge technology, it’s the time-tested, traditional practices and tools that enable us to wield this new capability most effectively."

manuel.kiessling.net/2025/03/3

The Log Book of Manuel Kießling · Senior Developer Skills in the AI Age: Leveraging Experience for Better Results • Manuel Kießling
How time-tested software engineering practices amplify the effectiveness of AI coding assistants.

In the end, it seems it is hard to tell the stupidity of humans apart from that of #chatbots ... 🥲 ..."But while #Trump expressed intent to push back on anyone supposedly taking advantage of the US, some of the countries on the reciprocal #tariffs list puzzled experts and officials, who pointed out to The Guardian that Trump was, for some reason, #targeting #uninhabited #islands, some of them exporting nothing and populated with penguins. ... arstechnica.com/tech-policy/20 ".

Engadget: Claude’s new Learning mode will prompt students to answer questions on their own. “At the heart of Claude for Education is a new Learning mode that changes how Anthropic’s chatbot interacts with users. With the feature engaged, Claude will attempt to guide students to a solution, rather than providing an answer outright, when asked a question. It will also employ the Socratic […]

https://rbfirehose.com/2025/04/03/engadget-claudes-new-learning-mode-will-prompt-students-to-answer-questions-on-their-own/

TechXplore: Experiments show adding CoT windows to chatbots teaches them to lie less obviously. “In a new study, as part of a program aimed at stopping chatbots from lying or making up answers, a research team added Chain of Thought (CoT) windows. These force the chatbot to explain its reasoning as it carries out each step on its path to finding a final answer to a query. They then tweaked […]

https://rbfirehose.com/2025/04/03/techxplore-experiments-show-adding-cot-windows-to-chatbots-teaches-them-to-lie-less-obviously/

Oh boy, another revolutionary idea to save search engines from the horror of users who can’t articulate their needs 😱. Let’s ditch the scary #chatbots and embrace the magic of MattersRank—because clearly, what we need is more personalized #chaos and less fun on the internet 😂.
matterrank.ai/mission #revolutionaryideas #MattersRank #searchengines #personalization #HackerNews #ngated

MatterRank · MatterRank - Customizable Search Engines
Build personalized search engines that score and rank results based on what matters to you.