• Conversational AI
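    OpenAI Releases GPT-5.3 Instant: Hallucination Rate Down by as Much as 26.8%, a Full Upgrade to ChatGPT's Everyday Conversation Experience

    OpenAI today released GPT-5.3 Instant, a version deeply optimized for the everyday ChatGPT conversation experience. The new model markedly improves answer accuracy and contextual understanding, cuts unnecessary refusals and long-winded disclaimers, and integrates web information better. In practice, whether you are looking something up, asking for an explanation, or just chatting, answers are smoother and more useful. For anyone who regularly uses AI to work more efficiently, this update is worth attention and a try.

    On March 3, 2026, with no launch event and no large-scale publicity, OpenAI quietly released GPT-5.3 Instant, a targeted optimization of the conversational model that is currently the most heavily used in ChatGPT. Unlike earlier updates that tended to stress benchmark breakthroughs, this iteration starts from an unusually pragmatic place: responding directly to real pain points that users have raised repeatedly in everyday use.

    What problem is OpenAI solving with this update?

    If you are a heavy ChatGPT user, the following scenarios will be familiar: you ask a completely harmless question and first receive a statement along the lines of "I can't help you with that"; or, before getting an answer, you have to read through a long preamble of disclaimers and moral reminders, by which point your patience has run out; or, with web search turned on, the model hands back a loose list of links rather than a genuinely integrated analysis.

    GPT-5.3 Instant targets exactly these three problems: it substantially reduces unnecessary refusals, removes overly defensive and preachy preambles, and improves the integration quality and contextual relevance of web search results. The new model is also better at recognizing the subtext of a question, judging the user's real intent more accurately and presenting the most important information first instead of opening with a "safety boundary" statement.

    The data: hallucination rates fall significantly, especially in high-stakes domains

    OpenAI provided two internal quantitative evaluations for this update.

    The first focuses on three high-stakes professional domains: medicine, law, and finance. With web search enabled, GPT-5.3 Instant's hallucination rate fell 26.8% relative to GPT-5.2 Instant; when relying on internal knowledge only (no web access), the drop was 19.7%.

    The second evaluation is based on de-identified historical ChatGPT conversations that real users had flagged as "factually wrong." It shows hallucinations down 22.5% in web search mode and 9.6% without web access.

    The significance of these two sets of figures is that they come from real usage scenarios rather than artificially constructed test sets, which gives them more direct reference value for professionals who rely on AI-assisted decisions at work, particularly in HR, legal, and finance roles.

    Not the flagship, but it solves problems the flagship cannot

    One point needs clarifying: GPT-5.3 Instant is not OpenAI's flagship model. In the product line it sits in the "everyday conversation efficiency tier," aimed at mid-range, high-frequency usage rather than complex reasoning or long-context processing. For that reason the value of this update lies not in being "smarter" but in being "more usable." The two are not equivalent, but for most enterprise users the latter often has the higher priority.

    OpenAI stated explicitly that the improvements in GPT-5.3 Instant come directly from user feedback rather than from the pressure of external leaderboards. That statement itself marks a structural shift in how leading AI vendors iterate on products: from a "capability race" to "experience refinement," from "I can do it" to "it is pleasant to use."

    Side by side: competing with Claude Sonnet 4.6, each with its own emphasis

    GPT-5.3 Instant's real competitor is Anthropic's same-tier Claude Sonnet 4.6, not the flagship Claude Opus 4.6. Based on the external evaluation data currently available, the two models each lead on different dimensions, showing a clear division of strengths.

    On coding and agentic tasks, Claude Sonnet 4.6 scores 79.6% on SWE-bench Verified, only 1.2 percentage points below Opus 4.6, while priced 40% lower than Opus; multiple evaluations rate it the best-value frontier coding model. GPT-5.3 Instant is not known for coding; OpenAI's workhorse in that area is GPT-5.3 Codex.

    On computer use tasks, Claude Sonnet 4.6 performs at nearly twice the level of GPT-5.2, and several enterprise field reports indicate strong self-correction in automated workflows.

    On writing and content generation, OpenAI CEO Sam Altman has publicly acknowledged that GPT-5.2 regressed in writing quality, with a style that was stiff and overly formal. GPT-5.3 Instant improves on this, but there is not yet enough independent third-party evaluation data to back that up. The Claude series has long been regarded as having an edge in writing fluency and natural tone.

    On overall intelligence rankings, according to the latest Artificial Analysis Intelligence Index, the top five are Gemini 3.1 Pro Preview (57), GPT-5.3 Codex (54), Claude Opus 4.6 (53), Claude Sonnet 4.6 (52), and GPT-5.2 (51). GPT-5.3 Codex and Claude Sonnet 4.6 are separated by only 2 points, placing them in the same competitive tier.

    On context window, Claude Sonnet 4.6 supports a long context of 1 million tokens and GPT-5.3 Codex 400,000 tokens, giving the former a clear structural advantage on long documents, large codebases, or multi-file tasks.

    The next competitive dimension for AI assistants is "not being annoying to use"

    The release of GPT-5.3 Instant reflects a clear-eyed product judgment: for users who truly embed AI in their daily workflow, whether a response is direct, accurate, and free of filler often matters more than a few extra points on some benchmark.

    Competition among AI assistants is moving from the benchmark game in the lab back to the real friction at the desk. OpenAI's direction this time is the right one.

    Anthropic's Claude Sonnet 4.6, meanwhile, currently holds the lead within its tier on coding, long-context processing, and computer use tasks. The two products serve different core use cases; when choosing tools, enterprise users should focus on the actual needs of their own workflows rather than a single leaderboard ranking.

    This competition has no finish line, but the yardsticks are becoming more and more pragmatic.

    Data sources for this article: OpenAI's official release page, the Artificial Analysis Intelligence Index, and public third-party evaluation reports.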
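The hallucination figures quoted in the article (26.8%, 19.7%, 22.5%, 9.6%) read most naturally as relative reductions against GPT-5.2 Instant, not percentage-point drops. The sketch below shows what that would mean in practice. It assumes that reading is correct, and it uses a made-up 10% baseline rate, because the article reports no baseline rates at all. The function name and the baseline are hypothetical and chosen only for illustration.

```python
"""Illustrate what a relative hallucination-rate reduction means."""

# Relative reductions as quoted in the article (GPT-5.3 Instant vs. GPT-5.2 Instant).
REPORTED_REDUCTIONS = {
    "high-stakes domains, web search on": 0.268,
    "high-stakes domains, web search off": 0.197,
    "user-flagged conversations, web search on": 0.225,
    "user-flagged conversations, web search off": 0.096,
}

# ASSUMPTION: the article gives no baseline rates; 10% is an invented value.
HYPOTHETICAL_BASELINE = 0.10


def rate_after_reduction(baseline: float, relative_reduction: float) -> float:
    """Return the rate that remains after cutting `baseline` by `relative_reduction`."""
    if not 0.0 <= baseline <= 1.0:
        raise ValueError("baseline must be between 0 and 1")
    if not 0.0 <= relative_reduction <= 1.0:
        raise ValueError("relative_reduction must be between 0 and 1")
    return baseline * (1.0 - relative_reduction)


if __name__ == "__main__":
    for label, reduction in REPORTED_REDUCTIONS.items():
        new_rate = rate_after_reduction(HYPOTHETICAL_BASELINE, reduction)
        print(f"{label}: {HYPOTHETICAL_BASELINE:.1%} -> {new_rate:.2%}")
```

Under this reading, a 26.8% reduction leaves roughly three quarters of the original errors in place, so the absolute improvement depends entirely on a baseline the article does not disclose.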
    March 3, 2026
  • Conversational AI
    Research Shows It’s Time To Reinvent Talent Acquisition

    Josh Bersin’s article “Research Shows It’s Time To Reinvent Talent Acquisition” stresses the change that talent acquisition urgently needs. With only 32% of HR executives involved in strategic planning and many feeling like mere order takers, the article calls for strategic reform. Amid labor shortages and an urgent need for skills-based hiring, the current approach of cutting costs and scaling back recruiting conflicts with the growing demand for skilled professionals. The article urges companies to treat talent acquisition as a key strategic function, use modern technology, and integrate it with learning and development to improve efficiency and focus on internal talent mobility.

    The original article follows:

    This week we published a disappointing research study, Talent Acquisition at a Crossroads. The study, conducted in partnership with AMS, points out that talent acquisition leaders (this is a senior position) are largely left out of their company’s strategic planning process and many feel they operate as “order takers.” In today’s world of labor and skills shortages, this is a wakeup call for change.

    Here’s the data: Among these 130+ HR executives only 32% are involved in any form of strategic workforce planning, 42% believe their company has no workforce plan at all, and 46% say “they’re running around to keep up.” And when layoffs do occur, often the recruiters go first. (Witness Tesla this week.)

    All this is happening in a world where 58% of companies feel skills shortages are significantly impacting their business plans, more than three-quarters believe they must transform their talent practices to grow, and “skills-based hiring” is a top priority yet difficult to implement.

    Here’s the paradox: companies are cutting their talent acquisition spending at the same time CEOs feel that skills shortages are getting worse. What’s going on?

    Talent Acquisition Needs A Reinvention

    Let’s just face it: recruiting as a business function has to change. Once considered the “staffing department,” where companies posted jobs and scanned resumes, talent acquisition has become a highly strategic operation. What skills do we need? How do we find people who will fit our culture? What internal candidates should fill our key positions? Who are the right leaders for us to hire?

    Unfortunately, almost 80% of talent acquisition functions are quite tactical.
    PwC’s CEO survey found that CEOs rate “hiring” as the third most bureaucratic process in their companies, tied with “too many emails” and “too many meetings” as a time-wasting process. And that explains why two-thirds of TA leaders are being asked to cut costs.

    I had a conversation last week with a former TA leader for one of the Big Three automakers. He told me that in the fervor to hire staff for EV engineering he was asked to hire “any engineer he could find, regardless of skill,” because the company was in such a hurry. No time for skills assessment, competitive planning, or even location analysis. Just “go out there and hire engineers.”

    We have been studying the auto industry as part of our GWI study and found that important EV roles (reliability engineer or power plant engineer, for example) are quite specialized and hard to find. Strategic recruiting departments need to understand these roles and source these individuals carefully. Just hiring engineering grads from a local community college is not going to move this needle. (Consider the data by Draup on what these roles are. Talent Acquisition teams with talent intelligence skills can pinpoint who to hire.)

    And it gets worse. In our Dynamic Organization research we found that high performing companies focus heavily on internal hiring, talent intelligence tools to find hidden talent, and continuous internal development to fill skills gaps. We can’t simply throw job requisitions over to the recruiting function any more: the people we need may be buried inside the company.

    This week Tesla announced a layoff of 10% of their workforce. Was there time to balance and redeploy talent internally? Absolutely not. According to my sources every business unit had to let 10% go, and many of the people being fired were talent acquisition leaders, the very people who help with these issues.

    We talk with many HR executives and there is an enlightened group.
    Companies that understand this issue (about one in eight) have elevated Talent Acquisition to a strategic function, they merge or integrate TA with L&D, and they redefine their recruiters as “talent advisors.” Mastercard, as a leader, just renamed their recruiters as “Career Coaches,” demonstrating their role in helping people find the right jobs.

    Despite the onslaught of AI, this role is becoming even more human-centric. High-powered recruiting teams source internal candidates, understand company culture, and have a deep knowledge of jobs, roles, and organizational dynamics. When well supported and trained, these professionals are strategic advisors, not just “recruiters.” And companies that understand this often outsource or automate much of the administration in recruiting.

    Technology plays a major role in this reinvention. Most large companies have dozens of legacy systems, many of which make the candidate experience difficult. When organizations focus on modernizing and streamlining their technology, talent acquisition can become 10-100X more efficient. This, in turn, gives recruiters and talent advisors the time to search for the right skills, carefully select the best candidates, and focus on internal hiring and development as a strategy.

    Technology Is Here But Not The Entire Answer

    Of all the HR technology markets, recruiting is the most innovative. New AI-powered systems like HiredScore (just acquired by Workday), Paradox (leader in conversational AI), Eightfold, Gloat, Draup, and Lightcast (pioneers in talent intelligence), and many others can reduce time to hire from months to weeks and weeks to days.

    But none of this technology works if the Talent Acquisition team is left on an island. In the last year I have met with more than 50 heads of talent acquisition and once the door is closed and we talk honestly, they always tell me the same thing.
    “We are not treated as a strategic function, we are being asked to cut costs, and we are constantly running from fire to fire to keep executives happy.” This type of “service-delivery” focus simply will not work in the new economy.

    What should companies do? As part of our Systemic HR initiative, we help companies evolve their TA function to operate in a more strategic way. Organizations like Bayer, Verizon, and many others have elevated the role of recruiter to talent advisor, they’re building skills in talent intelligence, and they’re integrating the recruiting function with L&D, career management, and employee engagement.

    I’ve always felt that recruiting is one of the most important things HR professionals do. If we can’t get the “right” people into the company, no amount of management can recover. But what does “right” mean? And how can we source, locate, and attract these particular people? This is a highly strategic operation, and one that must integrate with internal mobility, culture, and employee experience.

    I encourage you to read our Systemic HR research, join our Academy, or reach out to us or AMS for advice. In this new era of talent and skills shortages, we simply cannot run recruiting in this tactical way any longer.
    April 24, 2024
  • Conversational AI
    Will Chatbots Take Over HR Tech? Paradox Sets The Pace.

    In the fast-moving HR technology field, Paradox.ai has become a front-runner, with an advanced conversational AI platform that has transformed the recruiting process. Using natural language processing and AI, Paradox.ai offers a comprehensive solution covering the whole hiring process, from the initial job application to onboarding. The platform not only simplifies tedious steps such as screening and interview scheduling but also improves the overall candidate experience, significantly improving time-to-hire and quality-of-hire metrics. Founded by Aaron Matos in 2016, Paradox.ai now serves large clients such as Unilever, CVS Health, and General Motors, automating more than 90% of the hiring process. With its strong integration capabilities and its ability to sharply reduce hiring time and cost, Paradox.ai demonstrates the transformative power of conversational AI in HR technology.

    Chatbots used to be tinker-toys. You would type, try to get help, and usually end up with “please call support.”

    Well, all this has changed. Thanks to advanced NLP (natural language processing) and AI (retrieval-augmented generation), chatbots are entire applications. They can answer complex questions, search databases, and invoke transactions on your behalf. Pretty soon we’ll be able to ask our phones “please find me a flight to Los Angeles next Tuesday morning” and the system will check your location and calendar, look at flights, and book you a seat.

    Where is this going in HR? Well, the leader in this space is Paradox.ai, a company that pioneered the application of conversational AI in recruiting. And their system “defines the category.” Let me explain.

    Recruiting Is The Perfect Market For Conversational AI

    Recruiting is a goldmine for automation. When you post a job, applicants want to ask many predictable things: “How much does it pay?” “What are the hours?” “What uniform do I need?” or “What are the benefits?”

    The recruiter, a person devoted to filling positions, has to answer all these questions and more. They have to screen candidates, schedule interviews, check for qualifications, and look at credentials, experience, and more. It’s time-consuming, error-prone, and filled with wasted time. (That’s why talent acquisition teams have many “schedulers” and admins.)

    The average “time to hire” is over 45 days and often the process goes on for months.
    And throughout the experience the job seeker is left wondering “when will they call back” or “what else do I need to know?” (CEOs cite hiring as the third most time-wasting process in companies, following emails and meetings, estimated at “40% wasted time.”)

    Paradox uses Conversational AI to solve this problem. And because this is a “narrow but deep” space, the system does many things we can learn from in all our AI efforts.

    Paradox was founded by Aaron Matos in 2016. Aaron’s vision was to transform the candidate experience, revolutionizing the way candidates apply to jobs. Today Paradox has become a complete Conversational AI Recruitment Platform (chat to apply, scheduling, candidate support, ATS, assessments, onboarding, career site, and more), serving clients like Unilever, CVS Health, Pfizer, L’Oreal, Nestle, McDonald’s, FedEx, Compass Group, Disney, and General Motors.

    The platform automates tasks such as screening for requirements, interview scheduling, reminders, offers, and new hire onboarding. And because it’s so easy to use, it helps companies radically improve time-to-hire and quality of hire. Based on my conversations with clients, Paradox can automate more than 90% of the end-to-end hiring process, saving hiring managers hours every week and increasing candidate conversion by more than 10 times.

    But this innovation did not happen overnight.

    As you know, going to a candidate website and looking for a job is a frustrating process. There are often hundreds of jobs listed on a complex scrolling website, and it is very hard to even determine what job to apply for. You might argue that the website paradigm for job applications was never really a good idea in the first place. People don’t want to browse for jobs: they want to apply for a job that’s best for them.

    So the first thing Paradox did was create an easy to use assistant (Olivia) so candidates could ask questions and schedule interviews.
    And this meant that Paradox had to build integrations with every ATS and every personal email and calendar tool out there.

    Then, as companies started to use Paradox for scheduling, the company added more. Today Olivia, the chatbot, can integrate with background check vendors, schedule interviews, deliver assessments (Paradox acquired Traitify, a conversational assessment designed for this), and function as an ATS … all from a mobile phone. In many ways Paradox can be “the integration platform” for candidates and recruiters, stitching together the messy systems behind the scenes.

    This turned into a massive opportunity. Just as the Google Assistant or Siri hopes to be our single contact with the internet, Paradox partners with systems of record like Workday, SAP, and Oracle to bring conversational AI to any company. The company’s revenues have grown 11 times in the last four years, and are now nearly doubling each year.

    For customers Paradox has been amazing. As the candidate pipeline speeds up (by an order of magnitude), clients get higher quality candidates with dramatically reduced staff. (Staffing administrators can almost go away.)

    Consider high-volume hiring companies. These businesses (McDonald’s, Compass Group, Neighborly, FedEx, Disney) hire service-related workers on a regular basis. Their revenue is dependent on having enough people. With Paradox they can set up a “continuous recruitment process,” one that even hires people the same day they apply. Paradox has become essential to these companies’ growth, often paying for itself in less than a year (through reduced hiring staff, reduced spend on job ads, and reduced turnover).

    Today, as Paradox has built out its ATS, customers can rely on the platform to integrate front-end tools (job portals and candidate support) with back-end tools (scheduling, ATS, onboarding), most of which are legacy. One of our clients has 27 recruiting tools and they anticipate replacing more than half of them with a platform like Paradox.
    What about higher level white collar roles? Paradox works here too. General Motors uses Paradox (branded Evie) along with Workday (ATS) to redesign the process.

    Interview Scheduling: Evie automates scheduling of phone screens and interviews between recruiters, candidates, and internal teams. This has reduced the time taken for interview scheduling from an average of five days to 29 minutes.

    Candidate Experience: Evie interacts with candidates from the moment they land on GM’s career site until the completion of their interview. Candidates appreciate the immediate communication from Evie after they apply or complete an interview, and enjoy the autonomy to select and change interview times.

    Efficiency and Cost Savings: The automation of interview scheduling has led to a major reduction in the cost of external contractors for coordination.

    Career Site Interaction: Evie sits on GM’s career site, answering questions from potential candidates about jobs, benefits, and company culture. This interaction enhances the candidate’s experience and provides them with immediate responses to their queries.

    Where Is Paradox Going

    The company is perfectly positioned to continue its growth as companies look for AI solutions to improve the productivity and effectiveness of recruiting. And demand is high: the 2024 PwC CEO survey found that recruiting was considered the #3 “most bureaucratic process” by CEOs (following email and meetings).

    The impact on recruiters? All positive. Clients tell us they can redeploy hiring staff to help recruiters focus on the most important part of their job: talking with candidates.

    But there’s a much bigger story. When a job candidate is handled efficiently and effectively the process becomes a brand-builder for the candidate, improving quality of hire. Ambitious job seekers will not put up with (or wait for) a messy, confusing hiring process. So not only is the process faster and more efficient, the quality of hire goes up.
Companies are desperately looking for AI solutions that work. As Paradox has proven, when you focus deeply on the problem, conversational AI can be transformational. Listen to my conversation with Adam Godson (CEO) and you’ll hear the details. This is where the HR Tech market is going.
    April 4, 2024