[WG-InfoSharing] schema.org and semantic web topics was Re: W3C Data Privacy Vocabulary - Consent Receipt Inputs

Mark @ OC mark at openconsent.com
Sat Jun 22 12:47:26 UTC 2019

Nice James, 

This is really interesting stuff - thanks for sharing this 'Model Agreement for Sharing the Data and Benefits of Public Health Surveillance'.  
(Not only because health and surveillance are my favourite topics :-) 

It seems that if these were bound to the roles defined in privacy law, then the rights could be treated as obligations, which could then be modelled with the DPV.  Does this make sense in the context of a contract framework?

Ultimately, does it make sense to explore privacy rights as applications that can use this type of framework to make new things?  For example, privacy agreements from people to vendors.

A privacy agreement (using rights and their enforcement) could then connect to (or use) the UMA legal model <https://kantarainitiative.org/confluence/display/uma/UMA+Legal> for licensing data flows from people directly to enterprises, which is in the WG mission:
Focus on GDPR-related toolkits first. A toolkit could be anything that helps use or leverage an existing piece of legislation or framework, such as an SDK, a checklist, consent receipt <http://kantarainitiative.org/confluence/display/infosharing/Consent+Receipt+Specification> templates or profiles, or a set of CommonAccord <http://commonaccord.org/> text, and could be related to the GDPR itself, the EU-U.S. Privacy Shield, BCRs, and so on.

Have you had any thoughts in this direction? 

E.g., from this part:

Relevant legal frameworks for sharing data
Identify and describe any known relevant legal frameworks applicable to either or both Parties that will enable the sharing of data.



> On 21 Jun 2019, at 17:06, James Hazard <james.g.hazard at gmail.com> wrote:
> I understand that to be right about schema.org <http://schema.org/> - designed for semantic tagging of web pages.
> The thesis of CommonAccord is that common vocabularies can emerge from a “general purpose semantic tool,” if the tool is open enough to permit people to find their own solutions.  That’s also part of what we’re trying to demonstrate in “Modelling the EU Economy as an Ecosystem of Contracts.” 
> Web pages can be self-tagging if managed in source rather than as blobs:
> http://www.commonaccord.org/index.php?action=doc&file=G/ChathamHouseOrg/DataSharing/Demo/Acme-Quake.md#Activity-Purpose.sec <http://www.commonaccord.org/index.php?action=doc&file=G/ChathamHouseOrg/DataSharing/Demo/Acme-Quake.md#Activity-Purpose.sec>
> Slow to load because the document is big and because the parser is from 2014.
>> On Jun 21, 2019, at 8:37 AM, Andrew Hughes <andrewhughes3000 at gmail.com <mailto:andrewhughes3000 at gmail.com>> wrote:
>> Forking the thread...
>> If I understand correctly, schema.org <http://schema.org/> was created to assist web search tools to parse web pages better - which constrains its usefulness as a general purpose semantic tool (if it is even  possible to make that kind of thing)
>> Andrew Hughes CISM CISSP 
>> In Turn Information Management Consulting
>> o  +1 650.209.7542
>> m +1 250.888.9474
>> 5043 Del Monte Ave., Victoria, BC V8Y 1W9
>> AndrewHughes3000 at gmail.com <mailto:AndrewHughes3000 at gmail.com> 
>> https://www.linkedin.com/in/andrew-hughes-682058a <https://www.linkedin.com/in/andrew-hughes-682058a>
>> Digital Identity | International Standards | Information Security 
>> On Fri, Jun 21, 2019 at 8:34 AM James Hazard <james.g.hazard at gmail.com <mailto:james.g.hazard at gmail.com>> wrote:
>> A few quick thoughts, in line on Andrew’s list:
>>> On Fri, Jun 21, 2019, 7:32 AM Andrew Hughes <andrewhughes3000 at gmail.com <mailto:andrewhughes3000 at gmail.com>> wrote:
>>> Cool.
>>> Some more questions, if you don't mind...
>>> A) can we treat the list of terms in the vocabulary as exactly that: a controlled word list?
>> Most legal documentation can be done as a “controlled list” of words, phrases, document forms.  The problem is who is in “control.”  That is, a closed system will always be incomplete and therefore force edge cases and diversity into a standard vocabulary and conformity.  It can centralize “control” of the vocabulary and hence the thoughts.  Prototype inheritance enables “permissionless” variations at the edge. 
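The "prototype inheritance" point can be illustrated with a minimal Python sketch. This is not CommonAccord's implementation, and the term names and definitions are hypothetical. A local vocabulary overrides or adds terms and falls back to a shared one for everything else, so a variation needs no permission from whoever maintains the shared list.

```python
from collections import ChainMap

# Hypothetical shared vocabulary, maintained by some central group.
shared_vocabulary = {
    "Purpose": "The reason for which personal data is processed.",
    "Controller": "The party that decides why and how data is processed.",
}

# Hypothetical local variation made "at the edge"; the shared list is untouched.
local_vocabulary = {
    "Purpose": "The reason for processing, as stated in our own agreement.",
    "Steward": "A local role that the shared list does not define.",
}

# Lookups try the local layer first, then fall back to the shared layer.
vocabulary = ChainMap(local_vocabulary, shared_vocabulary)


def define(term):
    """Return the definition visible from the local point of view."""
    return vocabulary[term]
```

Here define("Controller") is inherited from the shared list, define("Purpose") is overridden locally, and define("Steward") exists only at the edge.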
>>> B) what is supposed to happen when a word has more than one definition? Or is the vocabulary not about definitions but rather about "list of words”?
>>> C) regarding the RDF - if one were to use, for example, JSON-LD and refer to schema.org <http://schema.org/> context and also this RDF - should it work? (Recognizing that this question is really stretching the limits of my knowledge on semantic web-ish topics - so please rephrase the question if needed)
>> I am far from an expert on this subject, but I found that RDF over-solves the problem of managing vocabularies.  Schema.org <http://schema.org/> is great, but also too limited, so it either needs a way to fork and build, or one needs to start otherwise and connect to it.
>> JSON-LD seems really useful, though I’ve found that you only need a very limited set to do most of the work.
>> http://www.commonaccord.org/index.php?action=json&file=Wx/org/schema/Person.md <http://www.commonaccord.org/index.php?action=json&file=Wx/org/schema/Person.md>
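On question C, a minimal sketch of one JSON-LD document that mixes schema.org terms and DPV terms through prefixes in a single @context. The DPV namespace IRI and the term names used here are assumptions for illustration and should be checked against the published vocabulary. The expand function is a toy that handles only prefix:name compact IRIs; it is not a JSON-LD processor.

```python
import json

# One document, two vocabularies, distinguished by prefix.
document = {
    "@context": {
        "schema": "http://schema.org/",
        "dpv": "http://w3.org/ns/dpv#",  # assumed namespace IRI
    },
    "@type": "schema:Person",
    "schema:name": "Alice Example",
    "dpv:hasPurpose": {"@id": "dpv:AcademicResearch"},  # assumed term names
}


def expand(compact, context):
    """Expand 'prefix:name' to a full IRI when the prefix is in the context."""
    prefix, sep, local = compact.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return compact


def expanded_keys(doc):
    """Return the full IRIs of all non-keyword properties in the document."""
    context = doc["@context"]
    return {expand(key, context) for key in doc if not key.startswith("@")}


print(json.dumps(document, indent=2))
```

Because each prefix expands to a different namespace, terms from the two vocabularies do not collide even when they sit side by side in the same document.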
>>> In the most simplistic scenario, does this usage sound right:
>>> - I am a Data Controller designing my Consent Receipt data structure
>>> - in this scenario, I have only one processing purpose
>>> - in order to choose which Purpose for Data Processing to include in the design, I choose the appropriate Purpose word from the DPV document. 
>>> - therefore I have confidence that other Data Controllers and Data Processors who also use the DPV will know what that specific Purpose word means when they see it in the Consent Receipt output file and can act accordingly
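The scenario above can be sketched as follows, under stated assumptions: the field names are hypothetical and are not taken from the Consent Receipt Specification, and the purpose identifiers and labels are illustrative stand-ins for entries in the DPV. The issuer picks one purpose identifier from the shared list; the receiver resolves the same identifier against the same list.

```python
# Hypothetical excerpt of a shared purpose vocabulary, keyed by identifier.
SHARED_PURPOSES = {
    "dpv:AcademicResearch": "Research conducted for academic purposes",
    "dpv:CommercialResearch": "Research conducted for commercial purposes",
}


def issue_receipt(controller, purpose_id):
    """Build a minimal receipt, refusing purposes outside the shared list."""
    if purpose_id not in SHARED_PURPOSES:
        raise ValueError(f"unknown purpose: {purpose_id}")
    return {"controller": controller, "purpose": purpose_id}


def interpret_receipt(receipt):
    """Resolve the purpose identifier as a receiving party would."""
    return SHARED_PURPOSES[receipt["purpose"]]
```

The confidence described in the last bullet comes from both sides resolving the identifier against the same list, not from the word itself.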
>>> On Fri, Jun 21, 2019 at 3:10 AM Harshvardhan J. Pandit <me at harshp.com <mailto:me at harshp.com>> wrote:
>>> Hi Andrew, All.
>>> On 20/06/2019 01:37, Andrew Hughes wrote:
>>> > What I'm actually interested in is how ontologies generally are consumed 
>>> > and used. When I read this one, some items read as definitions, some as 
>>> > description, and some as pure pointers to other documents.
>>> > 
>>> > I would like to understand why this is and what the implications are for 
>>> > implementers.
>>> I think Andrew's questions show the need for more information on what 
>>> the DPV *is* and why it is structured the way it is. Since the DPVCG is 
>>> currently welcoming feedback and comments on the DPV, I'll make a note 
>>> to write a better introduction and to add a section about 
>>> possible usage applications.
>>> BTW, the 'official' specification is at https://w3.org/ns/dpv <https://w3.org/ns/dpv> which is 
>>> IMO easier to go through than the RDF file.
>>> The DPV is not intended to be applicable to only a specific purpose or 
>>> application - its usage can be quite broad. The aim is to provide a 
>>> common vocabulary regarding the processing of personal data.
>>> The Base Vocabulary defines top-level classes for describing how the 
>>> processing of data takes place i.e. what purpose, personal data, legal 
>>> basis etc. It is not mandatory for an adopter to use this specific model 
>>> - they can utilise other ways of expressing personal data handling as well.
>>> The other 'modules' such as Purpose, Personal Data, etc. provide 
>>> concepts relevant for a specific domain. For example, purpose defines 
>>> the top-level classification of purposes (for the processing of personal 
>>> data). One may wish to use only a particular module from the vocabulary. 
>>> In that respect, DPV is quite generic.
>>> The primary reason DPV is provided in RDF/OWL2 (semantic web), is the 
>>> shared semantics - which is quite important in expressing knowledge. For 
>>> example, in specifying that 'Research' is a purpose, with further 
>>> specialisations such as 'Commercial Research' and 'Academic Research'.
>>> Or an even better example - First Name, Pet Name, Common Name - all 
>>> being specific categories of a top-level category of Name. So when one 
>>> is processing 'Name' it means one can process all categories falling 
>>> under the 'Name' category. Combine this with properties, and one can 
>>> express all this in what seems to be the 'cool' way to call it - a 
>>> knowledge graph.
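The Name example can be sketched in plain Python. The DPV itself expresses this with RDF/OWL subclass relations; here a simple child-to-parent map stands in for them, and the function name is hypothetical. The category names are the ones used in the message above.

```python
# Each category points to its broader (parent) category.
BROADER = {
    "FirstName": "Name",
    "PetName": "Name",
    "CommonName": "Name",
    "CommercialResearch": "Research",
    "AcademicResearch": "Research",
}


def is_within(category, permitted):
    """True if category equals permitted or is a specialisation of it."""
    current = category
    while current is not None:
        if current == permitted:
            return True
        # Walk one step up the hierarchy; None once the top is reached.
        current = BROADER.get(current)
    return False
```

So is_within("PetName", "Name") holds, while is_within("Name", "PetName") does not: permission for the broad category covers its specialisations, not the other way round.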
>>> Regards,
>>> -- 
>>> ---
>>> Harshvardhan Pandit
>>> PhD Researcher
>>> ADAPT Centre
>>> Trinity College Dublin
>>> -- 
>>> Andrew Hughes CISM CISSP 
>>> In Turn Information Management Consulting
>>> o  +1 650.209.7542 m +1 250.888.9474
>>> 1249 Palmer Road, Victoria, BC V8P 2H8
>>> AndrewHughes3000 at gmail.com <mailto:AndrewHughes3000 at gmail.com> 
>>> https://www.linkedin.com/in/andrew-hughes-682058a <https://www.linkedin.com/in/andrew-hughes-682058a>
>>> Digital Identity | International Standards | Information Security
>>> _______________________________________________
>>> WG-InfoSharing mailing list
>>> WG-InfoSharing at kantarainitiative.org <mailto:WG-InfoSharing at kantarainitiative.org>
>>> https://kantarainitiative.org/mailman/listinfo/wg-infosharing <https://kantarainitiative.org/mailman/listinfo/wg-infosharing>
