
Microsoft explores a way to credit contributors to AI training data


Microsoft is launching a research project to estimate the influence of specific training examples on the text, images, and other types of media that generative AI models create.

That's according to a job listing dating back to December that was recently recirculated on LinkedIn.

According to the listing, which seeks a research intern, the project will attempt to demonstrate that models can be trained in such a way that the impact of particular data on their outputs can be estimated "efficiently and usefully."
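The underlying technical idea, estimating how much each training example contributed to a given output, has a long research lineage (influence functions, gradient-tracing methods such as TracIn). Below is a minimal, purely illustrative sketch in the gradient-similarity spirit; it does not reflect Microsoft's actual method, and all names and numbers are made up for the example:

```python
# Toy sketch (NOT Microsoft's method): score each training example by how
# aligned its loss gradient is with the loss gradient of a test example,
# a crude proxy for "how much did this example influence that output?"
import numpy as np

rng = np.random.default_rng(0)

# Tiny linear regression "model": y = w . x, trained by gradient descent.
X = rng.normal(size=(8, 3))            # 8 training examples, 3 features
y = X @ np.array([1.0, -2.0, 0.5])     # ground-truth targets
w = np.zeros(3)
lr = 0.1

def grad(w, x, target):
    """Gradient of the squared error 0.5 * (w.x - target)^2 w.r.t. w."""
    return (w @ x - target) * x

# Stop well short of convergence so the gradients below are non-trivial.
for _ in range(25):
    w = w - lr * np.mean([grad(w, xi, ti) for xi, ti in zip(X, y)], axis=0)

def influence(x_tr, y_tr, x_te, y_te):
    """Dot product of training-example and test-example gradients at the
    current weights: positive means the training example pushed the model
    toward the test output, negative means away from it."""
    return grad(w, x_tr, y_tr) @ grad(w, x_te, y_te)

x_test, y_test = X[0], y[0]
scores = [influence(xi, ti, x_test, y_test) for xi, ti in zip(X, y)]
ranked = np.argsort(scores)[::-1]      # most influential examples first
print("training examples ranked by influence:", ranked)
```

Scaling anything like this to billion-parameter models and web-scale corpora, and doing it "efficiently," is precisely the open problem the listing describes.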

"Current neural network architectures are opaque in terms of providing sources for their generations, and there are […] good reasons to change this," reads the listing. "[One is,] incentives, recognition, and potentially pay for people who contribute certain valuable data to unforeseen kinds of models we will want to exist in the future, assuming the future will surprise us fundamentally."

AI-powered text, code, image, video, and song generators are at the center of a number of IP lawsuits against AI companies. These companies often train their models on massive amounts of data scraped from public websites, some of it copyrighted. Many of them argue that the fair use doctrine shields their data-scraping and training practices. But creatives, from artists to programmers to authors, largely disagree.

Microsoft itself is facing at least two copyright-related legal challenges.

The New York Times sued the tech giant and its sometime collaborator, OpenAI, in December, accusing the two companies of infringing the Times' copyright by deploying models trained on millions of its articles. Several software developers have also filed suit against Microsoft, claiming the firm's GitHub Copilot AI coding assistant was unlawfully trained on their protected works.

Microsoft's new research effort, which the listing describes as "training-time provenance," reportedly has the involvement of Jaron Lanier, a prominent technologist and interdisciplinary scientist at Microsoft Research. In an April 2023 op-ed in The New Yorker, Lanier wrote about the concept of "data dignity," which to him meant connecting "digital stuff" with "the humans who want to be known for having made it."

"A data-dignity approach would trace the most unique and influential contributors when a big model provides a valuable output," Lanier wrote. "For instance, if you ask a model for 'an animated movie of my kids in an oil-painting world of talking cats on an adventure,' then certain key oil painters, cat portraitists, voice actors, and writers (or their estates) might be calculated to have been essential to the new creation. They would be acknowledged and motivated. They might even get paid."

There are, not for nothing, already several companies attempting this. One AI model developer, which recently raised $40 million in venture capital, claims to compensate data owners according to their "overall influence." Adobe and Shutterstock also issue regular payouts to dataset contributors, although the exact payout amounts tend to be opaque.

Few large labs have established individual contributor payout programs outside of licensing agreements with publishers, platforms, and data brokers. Instead, they have provided means for copyright holders to "opt out" of training. But some of these opt-out processes are onerous, and they only apply to future models, not previously trained ones.

Of course, Microsoft's project may amount to little more than a proof of concept. There's precedent for that. Back in May, OpenAI said it was developing similar technology that would let creators specify how they want their works to be included in, or excluded from, training data. But nearly a year later, the tool has yet to see the light of day, and it reportedly hasn't been treated as an internal priority.

Microsoft may also simply be trying to "ethics wash" here, or to preempt regulatory and/or court decisions disruptive to its AI business.

Still, the fact that the company is investigating ways to trace training data is notable in light of the positions other AI labs have recently staked out. Several top labs, including Google and OpenAI, have published policy documents recommending that the Trump administration weaken copyright protections as they relate to AI development. OpenAI has explicitly called on the U.S. government to codify fair use for model training, arguing it would free developers from burdensome restrictions.

Microsoft did not immediately respond to a request for comment.
