Thursday, May 15, 2025
Stay Ahead with Heaptalk: Your Go-To Source for Business News

Databricks revamps its open-source code with a new 15k dataset to train AI models for commercial use

by Sinta
October 9, 2023
in News, Technology

Illustration of Databricks' open source code to train AI chatbots. Photo: Chris Ried/ Unsplash

Databricks collected a dataset of 15,000 instruction-response pairs from more than 5,000 employees during March and April 2023 to replace the previous training data.

Heaptalk, Jakarta — A startup providing open and unified platforms for data and AI, Databricks, released Dolly 2.0, the open-source instruction-following large language model (LLM) for commercial purposes (04/12).

The latest version of Dolly was trained on a dataset of 15,000 human-generated prompt and response pairs, intended to teach AI models the kind of interactivity seen in ChatGPT. According to the company’s official statement, the dataset contains natural and expressive instruction and response pairs, designed to represent a wide range of behaviors.

According to Databricks, the pairs cover tasks including brainstorming, content generation, information extraction, and summarization. The company collected the dataset from more than 5,000 employees in 40 countries, who filled out questionnaires during March and April 2023.

This new dataset was created to address the constraints of Dolly 1.0. Released in late March 2023, this initial version was trained on a dataset that the Stanford Alpaca team had generated using the OpenAI API.

That dataset is subject to OpenAI's terms of service, which are meant to prevent the creation of models that compete with OpenAI's own, such as ChatGPT. As a result, Dolly 1.0 could not be used in commercial products. Therefore, Databricks decided to create its own dataset for commercial use.

Users can verify the training data themselves

“We are open-sourcing the entirety of Dolly 2.0, including the training code, the dataset, and the model weights, all suitable for commercial use. This means that any organization can create, own, and customize powerful LLMs that can talk to people, without paying for API access or sharing data with third parties,” stated Databricks on its official blog.

Databricks CEO Ali Ghodsi said the company released the training data for free to help other companies build their own AI systems, possibly by using Databricks, as quoted by Reuters.

Ali admitted that the dataset is still not perfect, since it comes only from Databricks employees, who are mostly male. However, users can verify the training data themselves, which they cannot do with other models such as OpenAI’s ChatGPT and Google’s Bard.

“We are not claiming that this is an unbiased dataset. We are just trying to push the community to go in this direction of more transparency, and more of everyone owning their own models instead of just a few that we have to trust,” concluded Ali.

Tags: ai chatbot code, ai chatbot source code, databricks, databricks dolly, open source ai applications, open source ai code, open source chatbot builder, open source code

© 2024 Heaptalk.com