<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<!DOCTYPE GmsArticle SYSTEM "http://www.egms.de/dtd/2.0.34/GmsArticle.dtd">
<GmsArticle xmlns:xlink="http://www.w3.org/1999/xlink">
  <MetaData>
    <Identifier>25gmds033</Identifier>
    <IdentifierDoi>10.3205/25gmds033</IdentifierDoi>
    <IdentifierUrn>urn:nbn:de:0183-25gmds0339</IdentifierUrn>
    <ArticleType>Meeting Abstract</ArticleType>
    <TitleGroup>
      <Title language="en">OverlapES: An R Package, and Accompanying R Shiny Application for Identifying and Quantifying Sample Overlap in Evidence Synthesis</Title>
    </TitleGroup>
    <CreatorList>
      <Creator>
        <PersonNames>
          <Lastname>Zhang</Lastname>
          <LastnameHeading>Zhang</LastnameHeading>
          <Firstname>Zhentian</Firstname>
          <Initials>Z</Initials>
        </PersonNames>
        <Address>
          <Affiliation>Department of Medical Statistics, University Medical Center G&#246;ttingen, G&#246;ttingen, Germany</Affiliation>
        </Address>
        <Creatorrole corresponding="no" presenting="no">author</Creatorrole>
      </Creator>
      <Creator>
        <PersonNames>
          <Lastname>Friede</Lastname>
          <LastnameHeading>Friede</LastnameHeading>
          <Firstname>Tim</Firstname>
          <Initials>T</Initials>
        </PersonNames>
        <Address>
          <Affiliation>Department of Medical Statistics, University Medical Center G&#246;ttingen, G&#246;ttingen, Germany</Affiliation>
        </Address>
        <Creatorrole corresponding="no" presenting="no">author</Creatorrole>
      </Creator>
      <Creator>
        <PersonNames>
          <Lastname>Mathes</Lastname>
          <LastnameHeading>Mathes</LastnameHeading>
          <Firstname>Tim</Firstname>
          <Initials>T</Initials>
        </PersonNames>
        <Address>
          <Affiliation>Department of Medical Statistics, University Medical Center G&#246;ttingen, G&#246;ttingen, Germany</Affiliation>
        </Address>
        <Creatorrole corresponding="no" presenting="no">author</Creatorrole>
      </Creator>
    </CreatorList>
    <PublisherList>
      <Publisher>
        <Corporation>
          <Corporatename>German Medical Science GMS Publishing House</Corporatename>
        </Corporation>
        <Address>D&#252;sseldorf</Address>
      </Publisher>
    </PublisherList>
    <SubjectGroup>
      <SubjectheadingDDB>610</SubjectheadingDDB>
      <Keyword language="en">Sample Overlap</Keyword>
      <Keyword language="en">Evidence Synthesis</Keyword>
      <Keyword language="en">R Package</Keyword>
      <Keyword language="en">R Shiny Application</Keyword>
    </SubjectGroup>
    <DatePublishedList>
      <DatePublished>20251103</DatePublished>
    </DatePublishedList>
    <Language>engl</Language>
    <License license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
      <AltText language="en">This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License.</AltText>
      <AltText language="de">Dieser Artikel ist ein Open-Access-Artikel und steht unter den Lizenzbedingungen der Creative Commons Attribution 4.0 License (Namensnennung).</AltText>
    </License>
    <SourceGroup>
      <Meeting>
        <MeetingId>M0631</MeetingId>
        <MeetingSequence>033</MeetingSequence>
        <MeetingCorporation>Deutsche Gesellschaft f&#252;r Medizinische Informatik, Biometrie und Epidemiologie</MeetingCorporation>
        <MeetingName>70. Jahrestagung der Deutschen Gesellschaft f&#252;r Medizinische Informatik, Biometrie und Epidemiologie e. V. (GMDS)</MeetingName>
        <MeetingTitle></MeetingTitle>
        <MeetingSession>V: Evidence synthesis, meta-analyses and meta-science</MeetingSession>
        <MeetingCity>Jena</MeetingCity>
        <MeetingDate>
          <DateFrom>20250907</DateFrom>
          <DateTo>20250911</DateTo>
        </MeetingDate>
      </Meeting>
    </SourceGroup>
    <ArticleNo>Abstr. 132</ArticleNo>
  </MetaData>
  <OrigData>
    <TextBlock name="Text" linked="yes">
      <MainHeadline>Text</MainHeadline><Pgraph><Mark1>Introduction:</Mark1> In medical research, evidence synthesis often requires combining findings from multiple observational studies. A common challenge in this process is the potential overlap of samples across studies, especially when studies draw on existing databases such as registries <TextLink reference="1"></TextLink>. Such overlap can bias meta-analysis results and undermine the credibility of their conclusions. Addressing sample overlap is therefore crucial for improving the validity of synthesized evidence.</Pgraph><Pgraph><Mark1>State of the art:</Mark1> Current methods for handling sample overlap are primarily ad hoc solutions that rely on access to individual-level data or unique identifiers, which are frequently unavailable due to privacy concerns or data regulation policies <TextLink reference="2"></TextLink>. Some approaches correct meta-analysis results only in very specific cases <TextLink reference="3"></TextLink> or assume that the overlapping parts are known <TextLink reference="4"></TextLink>, which severely limits their applicability. Currently, there are no practical tools for estimating sample overlap when only aggregate data are available.</Pgraph><Pgraph><Mark1>Concept:</Mark1> To narrow this gap, we developed overlapES, an R package accompanied by an R Shiny web application. These tools implement a novel method grounded in set theory that infers sample overlap from the ranges of selected study-sample characteristics that are commonly reported, such as the location and time of data generation and patient characteristics. This approach enables the identification of potential overlaps without requiring individual-level data.</Pgraph><Pgraph><Mark1>Implementation:</Mark1> The R package overlapES provides functions for calculating the risks of overlap, visualizing these risks, and finding the overlap-free set of studies with the largest sample size. 
The R Shiny web application includes similar functions and additionally provides intuitive interfaces, allowing users to apply the method without extensive programming knowledge. We designed both tools to improve the accessibility of the method and to promote broader adoption in the research community.</Pgraph><Pgraph><Mark1>Lessons learned:</Mark1> Applying overlapES to practical examples proved useful for detecting potential sample overlaps. The tools enable easy application of a standardized solution to describe and estimate the extent of overlap between studies, and provide an intuitive way to address it. Further developments will focus on improving the robustness and flexibility of the algorithms, incorporating additional functions, and expanding applicability to other research domains.</Pgraph><Pgraph>The authors declare that they have no competing interests.</Pgraph><Pgraph>The authors declare that an ethics committee vote is not required.</Pgraph></TextBlock>
    <References linked="yes">
      <Reference refNo="1">
        <RefAuthor>Mathes T</RefAuthor>
        <RefAuthor>Jacobs A</RefAuthor>
        <RefAuthor>Pieper D</RefAuthor>
        <RefTitle>Systematic reviews and meta-analyses that include registry-based studies: methodological challenges and areas for future research</RefTitle>
        <RefYear>2023</RefYear>
        <RefJournal>Journal of Clinical Epidemiology</RefJournal>
        <RefPage>119-122</RefPage>
        <RefTotal>Mathes T, Jacobs A, Pieper D. Systematic reviews and meta-analyses that include registry-based studies: methodological challenges and areas for future research. Journal of Clinical Epidemiology. 2023;156:119-122. DOI: 10.1016&#47;j.jclinepi.2023.02.014</RefTotal>
        <RefLink>http:&#47;&#47;dx.doi.org&#47;10.1016&#47;j.jclinepi.2023.02.014</RefLink>
      </Reference>
      <Reference refNo="2">
        <RefAuthor>Hussein H</RefAuthor>
        <RefAuthor>Siddiqi K</RefAuthor>
        <RefAuthor>Hossain FN</RefAuthor>
        <RefAuthor>Sheikh A</RefAuthor>
        <RefTitle>Double-counting of populations in evidence synthesis in public health: a call for awareness and future methodological development</RefTitle>
        <RefYear>2022</RefYear>
        <RefJournal>BMC Public Health</RefJournal>
        <RefPage>1827</RefPage>
        <RefTotal>Hussein H, Siddiqi K, Hossain FN, Sheikh A. Double-counting of populations in evidence synthesis in public health: a call for awareness and future methodological development. BMC Public Health. 2022;22:1827. DOI: 10.1186&#47;s12889-022-14213-6</RefTotal>
        <RefLink>http:&#47;&#47;dx.doi.org&#47;10.1186&#47;s12889-022-14213-6</RefLink>
      </Reference>
      <Reference refNo="3">
        <RefAuthor>Jin Q</RefAuthor>
        <RefAuthor>Shi G</RefAuthor>
        <RefTitle>Meta-analysis of SNP-environment interaction with overlapping data</RefTitle>
        <RefYear>2020</RefYear>
        <RefJournal>Frontiers in Genetics</RefJournal>
        <RefPage>1400</RefPage>
        <RefTotal>Jin Q, Shi G. Meta-analysis of SNP-environment interaction with overlapping data. Frontiers in Genetics. 2020;10:1400. DOI: 10.3389&#47;fgene.2019.01400</RefTotal>
        <RefLink>http:&#47;&#47;dx.doi.org&#47;10.3389&#47;fgene.2019.01400</RefLink>
      </Reference>
      <Reference refNo="4">
        <RefAuthor>Lin DY</RefAuthor>
        <RefAuthor>Sullivan PF</RefAuthor>
        <RefTitle>Meta-analysis of genome-wide association studies with overlapping subjects</RefTitle>
        <RefYear>2009</RefYear>
        <RefJournal>American Journal of Human Genetics</RefJournal>
        <RefPage>862-72</RefPage>
        <RefTotal>Lin DY, Sullivan PF. Meta-analysis of genome-wide association studies with overlapping subjects. American Journal of Human Genetics. 2009;85(6):862-72. DOI: 10.1016&#47;j.ajhg.2009.11.001</RefTotal>
        <RefLink>http:&#47;&#47;dx.doi.org&#47;10.1016&#47;j.ajhg.2009.11.001</RefLink>
      </Reference>
    </References>
    <Media>
      <Tables>
        <NoOfTables>0</NoOfTables>
      </Tables>
      <Figures>
        <NoOfPictures>0</NoOfPictures>
      </Figures>
      <InlineFigures>
        <NoOfPictures>0</NoOfPictures>
      </InlineFigures>
      <Attachments>
        <NoOfAttachments>0</NoOfAttachments>
      </Attachments>
    </Media>
  </OrigData>
</GmsArticle>