Web Collections FAQs

Frequently Asked Questions (FAQs)

Q: Why am I directed to the Internet Archive's 'Not in Archive' page for internal links within some web sites?

A: Some web sites are only partially archived by the Archive-It web crawler. Web content may be excluded from the archive because of robots.txt exclusion requests, because portions of the web site are hosted on a different web domain, or because the crawler was not able to crawl the entire site.
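
As a rough illustration of the first cause, the sketch below uses Python's standard urllib.robotparser module to test whether a robots.txt file would exclude a given page. The robots.txt contents, the example.org URLs, the helper name is_excluded, and the crawler name archive.org_bot are all assumptions made for the example, not details taken from any archived site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: asks every crawler to stay out of /photos/.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /photos/
"""


def is_excluded(robots_txt: str, url: str, user_agent: str = "archive.org_bot") -> bool:
    """Return True if robots_txt asks user_agent not to fetch url.

    The default user_agent is an assumed crawler name used for illustration.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.org/about.html", "https://example.org/photos/1.jpg"):
        status = "excluded" if is_excluded(SAMPLE_ROBOTS_TXT, url) else "allowed"
        print(url, status)
```

A page that this check reports as excluded would be expected to lead to a 'Not in Archive' page when followed from an archived copy.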

Q: Why am I directed to a 'Not in Archive' page when trying to access photographs on archived Facebook pages?

A: All photographs on Facebook pages are excluded by robots.txt exclusion requests and cannot be captured by the Archive-It web crawler at this time.

Q: Why can I not navigate below the older posts/more tabs on Facebook or Twitter pages?

A: At this time, Archive-It cannot display the archived portion of Facebook or Twitter pages below the older posts/more tab, even though the content below the tab is being archived.

Q: Why can I not access twitpics in archived web pages?

A: At this time, twitpics and links from Twitter posts cannot be archived by the Archive-It web crawler.

Q: How do I know I am looking at an archived web page, and not the live web?

A: Some archived web sites contain internal links that were not captured by the Archive-It web crawler and will divert to the live web. Web sites archived by the University of Texas at San Antonio Special Collections will always have a yellow banner at the top of the page with the following text:

'You are viewing an archived web page, collected at the request of University of Texas, San Antonio using Archive-It. This page was captured on time/date, and is part of the University of Texas at San Antonio --- collection. The information on this web page may be out of date. See All versions of this archived page.'

Q: How do I remove my web page or photograph from the archive?

A: Web pages or web content currently in the archive cannot be removed from the web collections. If you would like your web site or web domain not to be archived by the University of Texas at San Antonio Special Collections' web crawler in the future, please contact the Special Collections staff.

Q: How do I make sure that my web site is included in the archive?

A: If you are a member of the University of Texas at San Antonio or South Texas community and your web content is not currently being archived by the Internet Archive or the Archive-It program, you can contact the Special Collections staff to inquire about having your web site included in the UTSA web archive.
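
One way to check whether the Internet Archive already holds a capture of a site is to query its public Wayback Machine availability endpoint. The sketch below is a minimal example under stated assumptions: the endpoint address and the shape of the JSON reply (an "archived_snapshots" object containing a "closest" entry) reflect the publicly documented service as best understood here, and the function names are hypothetical.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of the Wayback Machine availability service.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(site: str) -> str:
    """Build the query URL for a site such as 'example.org'."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': site})}"


def closest_capture(response: dict) -> Optional[str]:
    """Return the URL of the closest capture, or None if none is reported."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def check_site(site: str) -> Optional[str]:
    """Query the service over the network and return the closest capture URL."""
    with urlopen(availability_url(site), timeout=30) as reply:
        return closest_capture(json.load(reply))
```

Calling check_site("example.org") requires a network connection. A result of None only means that no capture was reported by this service for that address.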

Q: How does UTSA Special Collections decide which web sites to archive?

A: The University of Texas at San Antonio Special Collections staff use a variety of measures to determine whether a web site is appropriate for archiving and how frequently particular web sites are crawled. The appraisal checklist currently considers 15 factors when evaluating a web site for archiving, including: password protected (Y/N), organization/person, part of larger seed, crawled by the Wayback Machine, Wayback Machine start date, Wayback Machine frequency, robots.txt exclusions, web site subject, number of internal pages, content rating, frequency web content is updated, databases (Y/N), calendars (Y/N), last Wayback Machine crawl.

Q: Does Archive-It capture the date and time when a web page is updated?

A: No. The Archive-It web crawler can only take a snapshot of a web site either before or after the site has changed. Currently, there is no way to determine exactly when and how a particular web site modified its content.
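
The limitation can be pictured with a small sketch: comparing consecutive snapshots shows only that a change happened somewhere between two capture times, not the moment it occurred. The capture data and the function name change_windows below are hypothetical, and comparing hashes of the raw content is a simplification, since real pages often differ between captures for incidental reasons.

```python
import hashlib
from datetime import datetime
from typing import List, Tuple

Capture = Tuple[datetime, bytes]  # (capture time, page content)


def change_windows(captures: List[Capture]) -> List[Tuple[datetime, datetime]]:
    """Return (earlier, later) capture times between which the content changed."""
    ordered = sorted(captures, key=lambda capture: capture[0])
    windows = []
    # Compare each snapshot with the one captured immediately after it.
    for (t_prev, body_prev), (t_next, body_next) in zip(ordered, ordered[1:]):
        if hashlib.sha256(body_prev).digest() != hashlib.sha256(body_next).digest():
            windows.append((t_prev, t_next))
    return windows


if __name__ == "__main__":
    # Hypothetical captures of a single page.
    captures = [
        (datetime(2013, 1, 1), b"version one"),
        (datetime(2013, 2, 1), b"version one"),
        (datetime(2013, 3, 1), b"version two"),
    ]
    for earlier, later in change_windows(captures):
        print(f"Content changed between {earlier:%Y-%m-%d} and {later:%Y-%m-%d}")
```

With these sample captures the change can be placed between the second and third snapshots, but no more precisely than that.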