E294 - Nabanita De, Founder and CEO of PrivacyLicense.ai

The Data Diva E294 - Nabanita De and Debbie Reynolds (44 minutes)
Debbie Reynolds

[00:00] Debbie Reynolds: The personal views expressed by our podcast guests are their own and are not legal advice or official statements by their organizations.

[00:11] Hello, my name is Debbie Reynolds. They call me the Data Diva. This is the Data Diva Talks Privacy podcast, where we discuss data privacy issues with industry leaders around the world, with information that businesses need to know.

[00:24] Now I have a very special guest on the show, Nabanita De.

[00:29] She is the founder and CEO of PrivacyLicense.ai, and I have a couple of things that I have to share about Nabanita before we start.

[00:37] She was recognized by Fast Company, Google and Forbes for her work in privacy and AI.

[00:44] She is a former security and natural language processing leader at Uber and previously worked in AI at Microsoft Research.

[00:53] She has worked on building at the intersection of privacy, AI governance, and digital rights, and I'm super happy to have her here. She's also a four-time founder with a broad background in tech innovation and someone that I've been connected to for years on LinkedIn, and we never had a chance to chat until now.

[01:12] So thank you for being here.

[01:14] Nabanita De: Thank you for having me. This is really wonderful, and I've been seeing your work. So I'm really excited to collaborate with you and share on an important topic today.

[01:25] Debbie Reynolds: Absolutely. Well, as I said, I'm very excited to have you here for a number of reasons. We have been connected on LinkedIn, and I think we've had some interaction on LinkedIn over the years.

[01:35] But I'm really fascinated by your work with Privacy License and why you thought this was...

[01:43] First of all, it's important work, but I want people to understand anything that you want to share about your background, and then also about this. I see all the work that you do, and I found an article that was really interesting, where it described your work on Privacy License as a machine-readable privacy product.

[02:04] But yeah, share what you'd like to share about yourself and your product.

[02:10] Nabanita De: Absolutely. So my background, like you mentioned: I worked in tech across Microsoft, Uber, and fintech, for 10-plus years in the privacy and AI spaces, where I had an impact of saving over $5 billion in compliance costs.

[02:30] I ended up seeing how privacy is done across all of these different companies and sectors, and that led me to starting Privacy License, where we are building the world's first privacy operating system for the AI era.

[02:46] Our flagship product,

[02:48] AI Privacy License, is a machine-readable,

[02:51] legally enforceable data governance protocol that empowers creators to set machine-readable, enforceable rules on how AI companies can use their content, and on the AI company side, they are able to comply with the EU AI Act Code of Practice.

[03:09] And our other product, Privacy Website Auditor, empowers vibe-coded startups to ship privacy-compliant apps within minutes.

[03:20] And we have several customers across fintech, security,

[03:25] wearable companies,

[03:26] et cetera.

[03:27] Debbie Reynolds: Well, the way that you say it, I think people may not understand how big of an issue this is.

[03:34] Right. And how hard this is. This is an extremely hard problem.

[03:38] And I want to give my opinion, and I want your thoughts. So first of all, the Internet was really not built for privacy. Put it like that. And so a lot of the things that we see in innovation and technology don't really keep privacy in mind.

[03:52] But what we have now is a rapidly evolving landscape where technology is escalating in terms of how it's being built.

[04:02] We have consumer or human sentiment where people really care or want to know more about what their data is doing.

[04:10] And then we have companies that just want to do business and want to do it with the least amount of cost and friction. But what are your thoughts?

[04:19] Nabanita De: Oh, I totally agree, absolutely. The old Internet that we saw,

[04:24] for example,

[04:25] was run by protocols like robots.txt,

[04:29] or terms of service that we see on every website, which tell the bots and the crawlers how AI can interact with that content.

[04:39] Essentially it's like saying, yes, you're allowed to scrape, or no, you're not allowed to scrape, so it's a binary yes-or-no protocol.

[04:47] But today we see that on the Internet the AI bots are reading and consuming everything without giving creators any compensation or attribution. Essentially, if you write an article today, or if you create a blog, create art, create music,

[05:05] create any sort of creative work and you put it out on the Internet.

[05:09] The previous Internet that you saw was Google links, like the Google search site, right, where you would get business because your SEO would link people through to your page.

[05:21] But today people have stopped going on Google to search. They go on things like ChatGPT or Claude, for example, to search or gain more information.

[05:32] And those particular interfaces are built by scraping billions and billions of web pages and creative works, where compensation and attribution have not necessarily been given to the creators who published that knowledge from years of experience.

[05:51] So that is where the problem of AI copyright becomes so important. On one hand, creators are not getting compensation and attribution; on the other hand, AI companies do not have legal clarity on what kind of data they can train on, what kind of data they can scrape, and what kind of data essentially is allowed, and they need that legal surety.

[06:13] For example, the EU AI Act Code of Practice says that machine-readable rights need to be respected by AI companies if creators are setting them. So AI companies fall under regulatory pressure to respect these machine-readable rights, and there wasn't a good way to do that.

[06:33] And that is where our work becomes so important in AI Privacy License, where we are building this protocol so that creators are able to set rules saying how exactly AI should be interacting with their content, and on the other hand AI companies are able to read and comply, essentially,

[06:54] and it's very important.
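
[Editor's sketch] The binary allow-or-block behavior of robots.txt described in this answer can be seen with Python's standard `urllib.robotparser`. This is only a minimal illustration: the robots.txt content and the crawler name `ExampleAIBot` are made up for the example, and it says nothing about how any real crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; "ExampleAIBot" is an invented crawler name.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(user_agent: str, url: str) -> bool:
    """Return the only answer robots.txt can give: allowed or not allowed."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# There is no way to express "allowed with attribution" or "non-commercial only".
print(can_crawl("ExampleAIBot", "https://example.com/blog/post"))
print(can_crawl("SomeOtherBot", "https://example.com/blog/post"))
```

The result is a plain boolean per crawler and URL, which is the limitation the speakers contrast with more granular, machine-readable rules.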

[06:56] Debbie Reynolds: So when I talk to people about this,

[06:58] a lot of people say, well, all we need are better laws. But laws and regulations, they're not a shield.

[07:05] So they don't prevent things from happening, right, in technology.

[07:09] And you mentioned robots.txt.

[07:12] I've been raging about that forever because I actually used to call robots.txt the original fig leaf of the Internet.

[07:21] Right. So that was like, oh well, you know, you can use our stuff, but don't do this to our site. So there was really nothing preventing it. But we were trying to say this is the agreement that the creators had with people who are looking at their site and their information on the website.

[07:38] But now what we're seeing, as you say,

[07:41] we have technologies that aren't thinking about robots.txt, even though they say that they are. And we have technologies that can scrape data indiscriminately regardless of the creator or their particular rights.

[07:55] But why do you think,

[07:58] you know, I think this is incredibly important work that you're doing because we really have not, up to this point, had a middle solution.

[08:09] This is a technical problem in my view. And so we didn't really have a technical solution to that. But tell me your thoughts about the importance of using technology to solve this problem.

[08:20] Nabanita De: Absolutely.

[08:22] So when you think of web pages, a lot of companies,

[08:27] creators, started writing in their terms of use, for example, saying, hey, please do not scrape online content.

[08:33] But imagine the scale: AI trains on millions and millions of web pages. It's almost impossible for manual review to happen. Imagine the AI companies' lawyers manually reviewing every website to say whether it is allowed for training or not.

[08:51] That is an impossible problem to have. And every wrong crawl would lead them to lawsuits like the ones we are seeing in the industry today. Like, for example, Anthropic paid $2 billion for copyright lawsuits.

[09:08] And then we have also seen New York Times versus OpenAI lawsuits.

[09:12] So we've seen these lawsuits happen because of AI companies not having legal clarity on what could be trained on, what could be appropriately sourced data to train on, essentially.

[09:26] And doing it manually makes it extremely difficult. So through technology you can automate that, and that is what we are trying to solve. Imagine a set of rules that these companies or creators could set, and that these AI companies, in an automated fashion, will be able to read and put into their training and crawling pipelines.

[09:52] And essentially that tells them this content is allowed. For example, maybe commercial training is allowed, or only non-commercial training is allowed, or maybe you're not allowed to train on it, or maybe if you're training on it, you need to attribute it in a certain way or you need to compensate this creator a certain way.

[10:11] Or like this is the jurisdiction it is coming from. This is the kind of data there is. Like having that level of clarity is gold for these AI companies on the other end.

[10:22] And that can only happen through technology. We see, for example, the likes of Taylor Swift, who have money to hire lawyers to file trademarks for their likeness,

[10:37] essentially. But small creators maybe do not have that money to go to trademark law and do all of those things. And it becomes really difficult if we just rely on law. I think technology can really support this, alongside the legal terms, to ensure that creators are protected and that AI companies are training on sourceable data so that they can make better models.

[11:03] Debbie Reynolds: Excellent. Well, tell me a little bit more about the machine-readable part. So this is my thought about it. When, let's say, AI or some tool encounters someone's website,

[11:16] there may be things that they may not know about the product or about the website or about the creator. I would imagine that the tool or product will have to know, like, the jurisdiction,

[11:28] you know, the laws that will be applied,

[11:31] what a creator would want or not want to share about their information.

[11:37] And so how does that information get imbued into the product? Is that something your product adds or is that something you work with the company to be able to determine?

[11:48] Nabanita De: So essentially the creator sets this information. For example, maybe a creator wrote a website,

[11:56] they can say who is the owner,

[11:58] which jurisdiction it came from, what kind of data there is in that website,

[12:03] how should AI train on it,

[12:05] what kind of purpose it can be used for,

[12:08] and essentially they create this detailed metadata that gets attached to the content, and you can imagine it travels with the content through the AI ecosystem. So when an AI scraper or AI company or AI bot comes in contact with this content,

[12:27] this set of rules that the creator has set tells the AI company that this is how you can interact with this content, essentially. So that metadata becomes the guiding force for this AI company when they are coming in contact with the content, essentially.
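
[Editor's sketch] The creator-set metadata described in this answer (owner, jurisdiction, kind of data, training permission, allowed purposes, plus the attribution and compensation conditions mentioned earlier) could be modeled roughly as below. The transcript does not describe the actual AI Privacy License format, so every class, field, and function name here is a hypothetical chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ContentLicense:
    """Hypothetical creator-set metadata; field names are illustrative only."""
    owner: str
    jurisdiction: str
    data_type: str
    allow_training: bool
    allowed_purposes: tuple = ()          # e.g. ("educational",)
    attribution_required: bool = False
    compensation_required: bool = False

    def to_json(self) -> str:
        # Serialized form that could be attached to, and travel with, the content.
        return json.dumps(asdict(self), sort_keys=True)


def evaluate(lic: ContentLicense, purpose: str) -> dict:
    """Decide whether a crawler may train for `purpose`, and list obligations."""
    if not lic.allow_training or purpose not in lic.allowed_purposes:
        return {"allowed": False, "obligations": []}
    obligations = []
    if lic.attribution_required:
        obligations.append("attribute")
    if lic.compensation_required:
        obligations.append("compensate")
    return {"allowed": True, "obligations": obligations}


blog_license = ContentLicense(
    owner="Example Creator",
    jurisdiction="EU",
    data_type="blog article",
    allow_training=True,
    allowed_purposes=("educational",),
    attribution_required=True,
)

print(blog_license.to_json())
print(evaluate(blog_license, "educational"))
print(evaluate(blog_license, "commercial"))
```

Unlike a yes-or-no crawl rule, the decision here depends on the stated purpose and carries obligations back to the AI company, which is the kind of granularity the conversation is pointing at.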

[12:46] Debbie Reynolds: Very good. I want your thoughts about...

[12:48] You had mentioned the OpenAI New York Times lawsuit. And so what I have been seeing and I want your thoughts.

[12:57] As a result of that and other legal things around copyright and intellectual property, we're seeing a lot of the bigger publishers like New York Times and all these other magazines or newspapers, they're trying to put a lot of their content like behind paywalls so that they're not open on the Internet.

[13:20] But I think,

[13:21] I'm not sure that's the best solution. Obviously that wouldn't work for like maybe a smaller creator. But just what are your thoughts? I feel like the Internet is already becoming like more fragmented as a result of some of these things.

[13:35] And I want your thoughts about your product and how it can help people be more open. Because I feel like a lot of that move towards trying to either not put stuff on the Internet or wall it off from the Internet is trying to answer this question, where you're saying you can have it open,

[13:53] but you can have products to help you manage those rules.

[13:57] Nabanita De: Right. So I believe that "block all bots" or "allow all bots", the binary thought process that existed before due to previous protocols like robots.txt, essentially doesn't really help the business or the Internet.

[14:15] What I understand is that these businesses, at the end of the day, want to be cited by AI.

[14:22] They want to get business from AI. And on the AI side, AI companies want legally sourceable data to train on.

[14:31] So putting it behind a wall,

[14:33] like blocking it out or making the Internet closed, is not the solution. So what solutions like AI Privacy License do, essentially, is keep the Internet open but create a set of rules that the bots can read.

[14:48] So instead of thinking of it as something that will block the bots,

[14:53] which it can, if that is what the creator chooses,

[14:56] the creator can also say something like: I am allowing you to train on my content, but please give me attribution in XYZ way; or I'm allowing you to train on my thing, but please compensate me in a certain way; or I'm allowing you to train on this, but please train on my content only for educational purposes and not for commercial use.

[15:14] So these kinds of granular rules, which today AI can read and understand, become possible because of the technology shift that we are seeing in the market:

[15:24] AI is able to read and understand complex legal terms and make sense of them.

[15:29] And so a protocol that can attach to the content and create this clear set of guidelines, you can think of it this way, would really empower the Internet.

[15:40] My big vision is that every site, just like how they have a privacy policy, will have an AI privacy license. The goal behind that is just to tell how AI should be interacting with that site or with their content, so that the Internet can stay open, creators get compensated, and AI companies are able to be compliant with these AI governance regulations like the EU AI Act,

[16:09] which is coming out essentially.

[16:11] Debbie Reynolds: And tell me a little bit about the EU AI Act and its importance in your view, how it plays into not just your product, but just the way people are viewing artificial intelligence.

[16:23] And in my view I'm very interested.

[16:27] The thing that stood out to me about the EU AI Act was their focus on harm.

[16:33] Where I feel a lot of times we don't think about that or that's not talked about as much, where we talk about innovation for the sake of innovation and speed for the sake of speed.

[16:43] But really bringing it down to that human element I thought was really interesting. But what are your thoughts?

[16:48] Nabanita De: Yeah, I think so.

[16:52] The area of the EU AI Act that is relevant to my work is the EU AI Act Code of Practice, Copyright 1.3, which is the Copyright chapter's rule, which says that when a creator sets a machine-readable right,

[17:07] the AI company should respect those machine-readable rights and, based on that, should train on their content or not, essentially. And machine-readable protocols that are identified would be respected by the EU AI Act body.

[17:24] So it makes sense that the EU AI Act is supporting creators today, because creators are not getting compensated, they are not getting attribution, and they do not have a good way to set rules on how AI should be using their content, essentially.

[17:42] And so the EU AI Act taking a stand for these creators through this Copyright chapter's rules,

[17:50] to me it is well aligned with the incentives of the Internet as well. The way I see it is that in the future if creators no longer continue to be compensated, they do not have incentives to create anything.

[18:04] And then AI would be training on AI slop, and then AI models would continue to deteriorate, maybe, or they might not get the right data that is needed as we get closer and closer to AGI, essentially.

[18:21] So at the end of the day, for these AI companies, what really differentiates them is the quality of data that they are training on.

[18:29] And previously they got that good-quality data because AI did not exist on this level and creators were really creating the stuff. But if creators stop creating because they are not getting compensated, then AI company models would become weaker or more full of AI slop.

[18:48] So the thought process there is that it incentivizes the entire ecosystem to adopt a solution like the one the EU AI Act is pointing towards: machine-readable rules that creators are able to set and AI companies are able to read and comply with, so both parties win together.

[19:05] I feel like that is accelerating towards long-term innovation. Like, yeah, in the short term maybe creating and adding these rules can slightly slow down the processes, but in the long term it can reduce these lawsuits

[19:21] that are happening. Essentially everyone has a path to move forward together in a partnership model between creators and AI companies, and overall the Internet wins together.

[19:31] So for me, I think from a long-term innovation point of view, the incentives are aligned across the ecosystem and shared.

[19:39] Debbie Reynolds: I definitely see the incentive for someone who wants to protect their information and then also forward thinking organizations that see this as the best or a very good way to solve the problem that was very expensive to solve.

[19:56] Like you say, either, you know, reactively in a lawsuit, or maybe proactively, where they had spent a lot more human time being able to do this.

[20:08] And I agree with you about the incentive of having good data. Because if people don't trust systems, they don't put their data in systems or they don't put quality data into systems.

[20:23] And so I think that does impact the AI companies or the companies that are doing this. But are you seeing a strong incentive for companies to adopt this so that they are better at reading the signals that are being created from creators about how they want their data to be shared or used in AI systems?

[20:46] Nabanita De: I mean, the major companies have signed the EU AI Act Code of Practice, like Google, Amazon,

[20:55] Anthropic,

[20:57] Cohere, and multiple other companies. I think around 20 major AI teams,

[21:03] tech companies, have signed the Code of Practice saying that they would respect these machine-readable rules that are set out by the EU AI Act Code of Practice. So I believe that the leadership in these companies is also recognizing the need for shared rules, or how the business of the Internet is transforming slowly from the old SEO to GEO today, or essentially how they would be working with creators or how they would be

[21:35] working with regulatory bodies like the EU AI Act to respect that. So I think there is a strong cohesion between regulatory pressure

[21:44] and economic incentives for these AI companies, especially with regard to access to high-quality data, to adopt something like this. And in general, wanting to build better models: I think those were some of the common themes that led them to sign something like this and adopt it.

[22:04] So yeah, that is where my thought process would lie, essentially.

[22:07] Debbie Reynolds: Yeah.

[22:08] Now, in terms of the machine-readable rights that are in the European Union,

[22:14] we don't yet have those types of rights here.

[22:19] But I think we also have things that are emerging, like the Global Privacy Control,

[22:26] which is not the same thing, but the idea is somewhat similar, meaning that a person or an entity can communicate their preferences about different things. And I can see that extending to maybe machine-readable rights as it relates to what we're talking about, like AI and copyright and things like that.

[22:48] But what are your thoughts?

[22:50] Nabanita De: Oh absolutely, I totally agree with you that it can extend.

[22:55] Especially because the US is seeing so many copyright lawsuits that are happening against these AI companies.

[23:03] Essentially I think there would be laws and rules, and we have predominantly followed the EU, essentially, like before, when there was GDPR and that led to so many different privacy regulations that are happening across multiple states of the U.S.

[23:19] So I believe with these machine-readable rights, or the copyright precedent that is set by the EU, we might see similar themes occurring in the US as well in the upcoming years, essentially.

[23:33] Debbie Reynolds: I think so. I know that when the General Data Protection Regulation came out, and the CCPA in California, we saw some companies say, well, even though those rights pertain to the EU or pertain to people in California, for certain things they would extend those to other people in the US. So I'm wondering, with this idea around machine-readable rights or machine-readable privacy products,

[24:11] if companies are maybe using them successfully in these other jurisdictions, do you feel as though there may be incentive for them to expand that to maybe different states in the US that may not even have regulation right now around that?

[24:28] Nabanita De: Yeah, I mean, for example, the Grammys,

[24:32] they are essentially advocating for something like this.

[24:37] Essentially to protect the singers' likeness.

[24:43] And essentially a lot of different bodies that I've seen in the US are advocating for something like this. Even Hollywood's SAG-AFTRA, they are also advocating for something like this.

[24:57] So in the creator industry, this is a huge issue, essentially.

[25:03] I think everybody is pro-AI, but they want to get compensated or attributed, or they want to protect their intellectual property. And obviously it's not possible to follow your trademark to where it is getting ingested by an AI in traditional ways.

[25:20] So there need to be regulations, and there need to be things in place for it to happen.

[25:26] So I can definitely see how, in different states in the US especially, this problem goes into so many different spaces. For example, deepfakes are a direct sister space of this, essentially, in that so many deepfakes are happening because there again isn't clear attribution or clear compensation.

[25:49] So actors, for example, are getting deepfaked because their likeness, their voice, is being stolen and leveraged into deepfakes.

[25:59] So this problem in general goes into so many different spaces, and I can see how so many different organizations are currently advocating for regulations or rules that could happen on a statewide or a countrywide level in the US that could change that.

[26:20] And yeah, the Grammys, I have definitely seen them very actively advocating, with the TRAIN Act and the NO FAKES Act. These are the two acts they're trying to pass on a bipartisan level.

[26:32] So definitely a lot of different organizations are advocating currently.

[26:38] Debbie Reynolds: What's happening in privacy or data right now that's concerning you the most?

[26:44] Nabanita De: I think another area where we see a lot of new tech coming is vibe coding. Everyone is vibe coding right now,

[26:54] but there's no vibe coding privacy, right, in the sense of: how do you ensure that when you're vibe coding, you're following the privacy regulations appropriately?

[27:05] Ensuring that whatever you're creating through vibe coding complies with the different privacy regulations is entirely missing. So that is a huge space. That is another space that we are innovating in with one of our products,

[27:19] Privacy Website Auditor. It essentially ensures that when people are vibe coding and creating websites, they are in compliance with these privacy regulations, by finding privacy issues on the website they are creating that they might not know about.

[27:36] So I think this is a space where we'll see a lot of innovation. Every day millions of websites and millions of apps are being generated and created by everybody.

[27:49] Just like how vibe coding is affecting security,

[27:52] where we are seeing all of these different security-related incidents, we'll also see a huge number of privacy-related incidents come out of these apps that are being created without thinking of privacy by design, without thinking about privacy compliance.

[28:08] So that is another space where I think we'll see a lot in the next few months, essentially.

[28:13] Debbie Reynolds: One of the things that I think comes up a lot, and I want your thoughts, is in websites and apps. So let's say a developer has created a website for a restaurant,

[28:26] right? And it has different marketing and things on that website.

[28:30] But then now they want to use that work that they did for the restaurant website for a hospital website.

[28:37] And we know that because medical health, especially in a patient provider situation, is protected in a different way. Some of the things that were okay for you to do,

[28:48] maybe on a restaurant website aren't okay in the hospital website. And we're seeing this play out a lot in the mobile app space and also in the website space where companies are getting into trouble because like, for example, someone fills out a form on a hospital website or medical website,

[29:06] that data has to be treated differently than if someone was making an order at a restaurant. But what are your thoughts?

[29:13] Nabanita De: Absolutely. I think purpose limitation is a huge area within privacy, where you have to use the data that you have collected for the purpose that it was collected for and not for anything beyond.

[29:27] Data minimization and

[29:29] purpose limitation go hand in hand.

[29:31] And for different sorts of data, different regulations apply. Like you mentioned, for medical, HIPAA would apply; for a restaurant, different regulations would apply, essentially.

[29:43] And I think it comes down to, like I mentioned, a lot of the people who are vibe coding today. They do not have privacy knowledge. They do not have privacy lawyers, essentially.

[29:55] I think I saw in a study recently that for every 500 developers, there is one privacy lawyer. So it's not possible for that one privacy lawyer to ensure that every developer is shipping code or building websites or building apps that are privacy compliant.

[30:16] And that is where we'll see a lot of innovation, I think, in the next few months, where privacy is also something that could be factored in when people are coding and building and creating new technologies and new apps and new websites, essentially.

[30:35] Debbie Reynolds: There was a story in the news about a company that has gotten very popular, very famous recently for all the work that they were doing in AI, I think in the employment space, if I'm not mistaken.

[30:48] But then at the same time, they had this huge data breach that they're dealing with right now, where some of the data that they had collected about people, I guess, was information like maybe people's Social Security numbers, whatever information people were using to apply for jobs, things like that.

[31:06] And so they got into trouble with a data breach. And so

[31:13] to me, that story is an example of the imbalance there,

[31:17] where they're really doing well or doubling down on their use of artificial intelligence, and there's nothing wrong with that. But then there's this gap on the privacy part.

[31:28] And for them, I think it lowered the price, I believe, for their company, and it's creating some regulatory challenges for them. But I think what we're saying is that there are ways that you can approach data that create more safety for your company and also for the consumers.

[31:48] But what do you think?

[31:50] Nabanita De: Yeah, absolutely. I think security is a huge concern in the vibe coding era, where, for example, if you have sensitive information,

[31:59] you have to ensure that it is encrypted at rest and in motion, that there is data minimization, and that you are collecting it for the stated purpose.

[32:08] There are all of these privacy by design principles that apply.

[32:11] And I think not factoring in privacy by design, or not doing the right PIAs and DPIAs before building those products that we are seeing, leads to all of these wrong architectural choices that lead to data breaches.

[32:29] So I think it's all interconnected. Privacy and security are so interconnected in different ways that I definitely see there's a need for the right security measures as well as privacy measures, especially when building apps or creating.

[32:46] And I have heard of like, I think you're talking about Marker.

[32:51] I've heard about their,

[32:53] essentially, the data breach that they went through. And it's not just them. There are so many other companies as well where I'm hearing about data breaches recently.

[33:05] So definitely I think it's important to think about this. Privacy compliance exists, essentially, to protect,

[33:16] safeguard, and ensure that sensitive data is collected, stored, shared, and cared for appropriately. And when compliance, like privacy by design, is not taken seriously, we see all of these different issues happen.

[33:34] Debbie Reynolds: I think to me the message is,

[33:37] you know, it's very expensive to have a breach and it's very expensive to have to go through that process. But if you can avoid that or minimize that at the beginning, like you're talking about with privacy by design and thinking about it at the outset, I think that helps companies be able to move fast, because these types of things do slow companies down.

[33:57] But I want your thoughts about...

[34:00] We're talking about vibe coding, and we didn't yet talk about AI agents. But the thing that concerns me about data in companies is that a lot of the privacy issues that come up have to do with the context in which the data is managed or handled or accessed.

[34:18] And so some of the things that concern me about, let's say for instance, two different departments were handling data and they're handling certain data about people, but not everything.

[34:29] So maybe in isolation a particular data point about someone would not be harmful.

[34:37] But now let's say the organization,

[34:39] they're taking data from different parts of the company,

[34:43] they're combining the data together.

[34:45] And then now this data is more risky because it's more identifiable, personally identifiable to the person.

[34:52] But what are your thoughts?

[34:54] Nabanita De: Yeah, no, absolutely. I think that's where least privilege comes into play, where essentially it matters who has access to the data and for what purpose,

[35:05] and essentially reviewing that on a frequent basis is important.

[35:10] And then tagging the data appropriately with the right data classification tags is also important.

[35:16] So that it is clear what purpose the data is used for and what kind of sensitivity it has.

[35:21] If a department is thinking of merging data, then those tags need to be updated. The right access needs to be updated. Least privilege again would prevail here as well, to ensure that the people who have access to it are the people who really need access to it.

[35:38] And it's also reviewed on a daily basis and there's a time to live assigned to it. So there are so many technicalities that need to go into place to ensure that the data is protected and the data is used for the purpose that it was collected for.

[35:54] And if the data is being merged and the purpose is changing, then getting consent again from the users is also required. Because we have seen so many regulatory issues happen for companies where they took the data for a certain purpose and then did not ask the user for consent for a different purpose.

[36:14] And so then the regulators fined them. So essentially asking users again for consent for the new purpose the data will be used for will also become important. There are so many spaces that need to align.

[36:29] And again it comes down to having the right privacy knowledge and teams. As processes change or as any app is changing,

[36:39] doing PIAs (Privacy Impact Assessments) or DPIAs appropriately can alleviate some of these issues. And also doing privacy design reviews before making these design decisions can ultimately help.
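
The controls described in this answer (classification tags, least privilege, a time to live on access, and fresh consent when the purpose changes) can be illustrated with a minimal Python sketch. Every name here (`Sensitivity`, `DataTag`, `AccessGrant`, `merge_tags`, `needs_new_consent`, `can_access`) is hypothetical and not taken from the conversation or from any real product. The merge rule is also an assumption: the merged data takes the higher sensitivity of the two sources and keeps only the purposes consented to for both.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import IntEnum


class Sensitivity(IntEnum):
    # Hypothetical classification levels, ordered from least to most sensitive.
    PUBLIC = 0
    INTERNAL = 1
    PERSONAL = 2
    SENSITIVE = 3


@dataclass(frozen=True)
class DataTag:
    sensitivity: Sensitivity
    purposes: frozenset  # purposes the user has consented to


@dataclass(frozen=True)
class AccessGrant:
    user: str
    purpose: str
    expires_at: datetime  # time to live: the grant is invalid after this moment


def merge_tags(a: DataTag, b: DataTag) -> DataTag:
    # Assumed rule: merged data is as sensitive as its most sensitive source,
    # and only purposes consented to for BOTH sources carry over.
    return DataTag(max(a.sensitivity, b.sensitivity), a.purposes & b.purposes)


def needs_new_consent(tag: DataTag, new_purpose: str) -> bool:
    # A purpose outside the consented set means asking the user again.
    return new_purpose not in tag.purposes


def can_access(grant: AccessGrant, tag: DataTag, now: datetime) -> bool:
    # Least privilege: the grant must be unexpired and tied to a consented purpose.
    return now < grant.expires_at and grant.purpose in tag.purposes


billing = DataTag(Sensitivity.PERSONAL, frozenset({"billing", "support"}))
support = DataTag(Sensitivity.INTERNAL, frozenset({"support"}))
merged = merge_tags(billing, support)

now = datetime(2025, 1, 1)
grant = AccessGrant("analyst", "support", now + timedelta(days=1))

print(merged.sensitivity.name)                       # PERSONAL
print(needs_new_consent(merged, "marketing"))        # True
print(can_access(grant, merged, now))                # True
print(can_access(grant, merged, now + timedelta(days=2)))  # False
```

In this sketch, merging never lowers the sensitivity tag and never widens the consented purposes, so a new use such as marketing triggers a consent request rather than silently inheriting access.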

[36:53] Debbie Reynolds: I want your thoughts about metadata. I did a video about this that's coming out soon.

[36:59] But I think metadata is going to be very important in the future because of things like deepfakes. In the camera industry, we're seeing some digital camera makers build their cameras in a way that if you take a picture,

[37:17] certain metadata about the creation of that photograph is kept in the photograph, but it can't be changed. That's a new thing I'm seeing. But I think metadata fields and the information about the data would have to expand in order to enable the things like you're doing with like machine readable privacy.

[37:38] But I think,

[37:39] you know, the purpose, like the more information about the data I think would need to be imbued in the data to make it more machine readable. But what do you think?

[37:51] Nabanita De: Yeah, absolutely. And that's exactly what we are doing at AI Privacy License. We are embedding metadata into the content, essentially. And it is very granular metadata. So it has information, for example, like is it AI generated?

[38:06] What kind of purpose can it be used for? What kind of data classification is there? What data category is there?

[38:14] Then what can it be used to train? Is commercial use allowed? Is educational use allowed?

[38:23] Can AI be trained on the content or not? So there is all of this granular metadata that can really help in terms of defining the content.

[38:34] With images, for example, in cameras, there could be things like, as you mentioned, the origin, which can really help.

[38:43] We are even seeing watermarking. Technologies like watermarking, if something is generated by AI, can be useful, especially with these deepfakes.

[38:53] So this is an evolving space, I believe,

[38:57] where we would continue to see these watermarking technologies happen not only in images, but also I think certain AI generating devices are also watermarking text that is generated by AI, essentially.

[39:14] So embedding that metadata helps ensure that people know whether this was AI-generated text or was written by humans.

[39:24] So we are definitely seeing a lot of these things,

[39:28] these innovations, happen, and they are really important in the different contexts of issues that we are seeing, be it deepfakes, be it copyright, be it, yeah, different spaces for sure.
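
To make the idea of granular, machine-readable metadata concrete, here is a minimal Python sketch of a rights record and a check that a training or crawling pipeline might run before using a piece of content. The field names (`ai_generated`, `data_category`, `commercial_use`, `educational_use`, `ai_training`) and the function `may_train_on` are hypothetical, chosen only to mirror the questions listed above. They are not the actual AI Privacy License schema or library API.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ContentLicense:
    # Hypothetical fields mirroring the questions raised in the conversation.
    ai_generated: bool      # was the content produced by AI?
    data_category: str      # e.g. "blog_post", "photo"
    commercial_use: bool    # is commercial use allowed?
    educational_use: bool   # is educational use allowed?
    ai_training: bool       # may AI models be trained on it?


def may_train_on(license_json: str, use: str) -> bool:
    """Return True only if training is allowed for the given use.

    `use` is "commercial" or "educational". Missing fields are treated
    as "not allowed" (deny by default), which is an assumption of this sketch.
    """
    fields = json.loads(license_json)
    training_ok = bool(fields.get("ai_training", False))
    use_ok = bool(fields.get(f"{use}_use", False))
    return training_ok and use_ok


lic = ContentLicense(
    ai_generated=False,
    data_category="blog_post",
    commercial_use=False,
    educational_use=True,
    ai_training=True,
)
doc = json.dumps(asdict(lic))  # the machine-readable form embedded with the content

print(may_train_on(doc, "educational"))  # True
print(may_train_on(doc, "commercial"))   # False
print(may_train_on("{}", "educational")) # False: no metadata means no permission
```

Denying by default when a field is absent is a design choice of this sketch: content without an explicit grant is treated as off limits rather than free to use.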

[39:42] Debbie Reynolds: Yeah. Well, this is so fascinating. I think you're definitely way ahead of everyone else in thinking through this issue, which is really cool.

[39:51] But if it were the world according to you, Nabanita, and we did everything you said, what would be your wish for privacy anywhere in the world?

[39:59] Whether that be regulation,

[40:02] technology or human behavior?

[40:03] Nabanita De: I think privacy at the end of the day is a human right. It's defined as a human right. And I think every person should be able to use the Internet, use AI, and leverage the innovation.

[40:20] But at the same time they should be able to protect their content, they should be able to protect their intellectual property, they should be able to protect their data, they should have transparency in terms of how their data is being collected, used,

[40:35] shared by companies.

[40:37] And I think in a world where everything is done appropriately, privacy really becomes the brand differentiator and is embedded in every company's mission.

[40:47] Privacy should be of the utmost, number one importance, especially when it comes to user data. When it comes to users,

[40:56] I think people would have a clear understanding of how their data would be used. Today, for example, all of the models are black box models.

[41:03] In a world where privacy is done perfectly,

[41:06] there'll be a clear understanding of what kind of data was ingested within these models, for example,

[41:12] or how a company, for example, is using my data or how is privacy being protected. I think privacy policies are a starting point, but we are seeing a lot more innovation happen in that space.

[41:26] So definitely looking forward to a world where privacy is like I mentioned, embedded in the mission statement of every company associated.

[41:35] Debbie Reynolds: I love that, I love that, I love that wish. I think that's a great one. But then also it's something that's possible.

[41:43] So I would love to see that innovation.

[41:45] But thank you so much, Nabanita, for being on the show. I really appreciate it and I support your work.

[41:51] Tell us how people can see your product and get involved with your work.

[41:56] Nabanita De: Absolutely. So you can find us at privacylicense.ai, which is our official website. And if you're intrigued by the AI Privacy License initiative that we are doing, you can follow us at aiprivacylicense.com and you can generate a license there.

[42:12] Generating a license is free. And if you are an AI company who is listening to this podcast,

[42:18] we have released open source libraries that you can embed in your training and crawling pipelines, essentially to read these machine-readable rights. And you're able to do that as well. Completely open source as well.

[42:31] So the innovation is there, completely open source, end to end.

[42:35] Please try it out, give us feedback, reach out to me on LinkedIn and tell me how you're using these technologies. And let's work together and create a world where creators get compensated and AI companies get legal clarity.

[42:52] Debbie Reynolds: Excellent. Well, thank you again so much for being on the show. I am a great fan and follower of your work and it's a pleasure for me to be able to share your work with the audience.

[43:04] Nabanita De: Thank you so much again. I really appreciate you having me on the show and thank you for your kind words as well.

[43:10] Debbie Reynolds: Oh, thank you, thank you. Well, we'll talk soon.

[43:13] Nabanita De: Sounds good.

[43:14] Debbie Reynolds: All right, thank you.
