🔐
SecWiki
  • Home
  • General
    • Interesting Links
      • Curriculum
    • Pentest Labs, Wargames Sites
      • How To Vulnhub with VirtualBox
  • Network Pentest
    • Courses
      • TCM - Zero to Hero
        • Week 1: Setup
          • ipsweep.sh
        • Week 2: Python 101
          • python101.py
          • bof.py
        • Week 3: Python 102
          • python102.py
          • scanner.py
        • Week 4: Passive OSINT
        • Week 5: Scanning Tools & Tactics
          • nmap
          • Nessus
          • msfconsole
        • Week 6: Enumeration
        • Week 7: Exploitation, Shells, and Some Credential Stuffing
        • Week 8: LLMNR/NBT-NS Poisoning
        • Week 9: NTLM
        • Week 10: MS17-010, GPP/cPasswords, and Kerberoasting
        • Week 11: File Transfers, Pivoting, Reporting
        • Commands
      • Penetration Testing Student (PTS)
      • OSCP Study
    • Recon
      • OSINT
    • Enumeration
      • Samba Shares
      • ProFtpd
    • Gaining Access
      • Reverse Shells
    • Privilege Escalation
      • Meterpreter
      • Spawning a TTY Shell
      • Reverse Shell Cheat Sheet
      • Cracking Hashes
      • Restricted Linux Shell Escape
      • Linux Privilege Escalation
        • lxd
        • sytemctl
      • Windows Privilege Escalation
        • Active Directory
          • What is AD?
        • User Enumeration
    • Post Exploitation
      • Cleanup
      • Maintaining Access
      • Pivoting
      • File Transfers
      • Covering Tracks
    • Vulnerabilities Checklist
    • Report Writing
  • Web App Pentest
    • Tools
      • Burp Suite
      • THC-Hydra BruteForce
    • Injection
      • SQL Injection
    • Broken Authentication
    • Sensitive Data Exposure
      • SQLite3
    • XML External Entity
      • XML Background
      • XPath Injection
    • Broken Access Control
    • Security Misconfiguration
    • Upload/Download
      • Download Bypass: Poison Null Byte
    • XSS
      • DOMXSS
      • Persistent XSS
      • Reflected (Client-side) XSS
      • Data URLs
    • Insecure Deserialization
    • Components with Known Vulnerabilities
    • Insufficient Logging and Monitoring
    • Server-Side Request Forgery (SSRF)
  • CTF
    • Intro to CTF
    • Forensics
      • Challenges
    • Steganography
    • Reverse Engineering
    • Tools
  • Network Security
    • Courses
      • Sec+
      • IBM Cybersecurity Analyst Professional Certificate
      • ISCI CNSS Course
        • Introduction to Network Security
          • Network Basics
          • Basic Network Utilities
          • The OSI Model
          • Threat Classification
          • Security Terminology
          • Approaches of Network Security
          • Law and Network Security
        • Types of Attacks
          • Denial of Service Attacks
          • Buffer Overflow Attacks
          • IP Spoofing
          • Session Hijacking
        • Fundamentals of Firewalls
          • What is a Firewall
          • Firewall Types
          • Firewall Implementation
          • Proxy Servers
          • Windows Firewalls
          • Linux Firewalls
        • Intrusion-Detection Systems
          • IDS Concepts
          • Components and Processes of IDS
          • Implementing IDS
          • Honeypots
        • Fundamentals of Encryption
          • The History of Encryption
          • Modern Encryption Methods
          • Windows and Linux Encryption
          • Hashing
          • Cracking Passwords
        • Virtual Private Networks (VPN)
          • Introduction to VPN
          • VPN Protocols
          • IPSec
          • SSL/TLS
          • VPN Solutions
        • Operating System Hardening
          • Configuring Windows
          • Configuring Linux
          • Operating System Patches
        • Virus Attacks and How to Defend
          • Virus Types and Attacks
          • Virus Scanners
          • Antivirus
          • Virus Infection and Identification
          • Trojan Horses
          • Spyware or Adware
        • Security Policies
          • User Policies Definition
          • System Administration Policies
          • Access Control
        • Assessing System Security
          • Risk Assessment
          • Conducting an Initial Assessment
          • Probing the Network
          • Vulnerabilities
          • Documenting Security
        • Security Standards
          • ISO Standards
          • NIST Standards
          • General Data Protection Regulation (GDPR)
          • PCI DSS
        • Physical Security and Recovery
          • Physical Security
          • Disaster Recovery
          • Fault Tolerance
        • Attackers Techniques
          • Hacking Preparation
          • The Attack Phase
          • Hacking Wi-Fi
    • The Web
    • The OSI Model
    • Malware Traffic Analysis with Wireshark
  • Digital Forensics
    • Autopsy - open-source digital forensics platform
  • Exploit Dev/Analysis
    • Code Review
      • Tools
    • Buffer Overflows
    • Static Analysis
      • Antivirus Scanning
      • Hashing
      • File strings
      • Packed and Obfuscated Malware
        • Demo: UPX
      • Portable Executable File Format (PE)
        • Tools
        • Linked Libraries and Functions
        • PE File Headers and Sections
  • Shell
    • ./missing-semester
      • Course overview + the shell
      • Shell Tools and Scripting
      • Editors (Vim)
      • Data Wrangling
      • Command-line Environment
    • Bash Tricks
    • .bashrc
    • Random Commands
      • sed
  • Hardware
    • NAND2Tetris
      • Boolean Functions and Gate Logic
      • Boolean Arithmetic and the ALU
      • Memory
      • Machine Language
      • Computer Architecture
      • Assembler
  • Other
    • K8s
      • Chapter 1: From Monolith to Microservices
      • Chapter 2: Container Orchestration
      • Chapter 3: Kubernetes
      • Chapter 4: Kubernetes Architecture
Powered by GitBook
On this page
  • eXtensible Markup Language
  • What is XML?
  • Why we use XML?
  • Syntax
  • Document Type Definition (DTD)

Was this helpful?

  1. Web App Pentest
  2. XML External Entity

XML Background

eXtensible Markup Language

What is XML?

XML (eXtensible Markup Language) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It is a markup language used for storing and transporting data.

Why we use XML?

1. XML is platform-independent and programming language independent, thus it can be used on any system and supports the technology change when that happens. 2. The data stored and transported using XML can be changed at any point in time without affecting the data presentation. 3. XML allows validation using DTD and Schema. This validation ensures that the XML document is free from any syntax error. 4. XML simplifies data sharing between various systems because of its platform-independent nature. XML data doesn’t require any conversion when transferred between different systems.

Syntax

Every XML document mostly starts with what is known as XML Prolog. <?xml version="1.0" encoding="UTF-8"?>

Above the line is called XML prolog and it specifies the XML version and the encoding used in the XML document. This line is not compulsory to use but it is considered a `good practice` to put that line in all your XML documents. Every XML document must contain a `ROOT` element. For example:

<?xml version="1.0" encoding="UTF-8"?> <mail> <to>falcon</to> <from>feast</from> <subject>About XXE</subject> <text>Teach about XXE</text> </mail>

In the above example the <mail> is the ROOT element of that document and <to>, <from>, <subject>, <text> are the children elements. If the XML document doesn't have any root element then it would be consideredwrong or invalid XML doc. Another thing to remember is that XML is a case sensitive language. If a tag starts like <to> then it has to end by </to> and not by something like </To>(notice the capitalization of T) Like HTML we can use attributes in XML too. The syntax for having attributes is also very similar to HTML. For example: <text category = "message">You need to learn about XXE</text>

In the above example category is the attribute name and message is the attribute value.

Document Type Definition (DTD)

Before we move on to start learning about XXE we'll have to understand what is DTD in XML.

DTD stands for Document Type Definition. A DTD defines the structure and the legal elements and attributes of an XML document.

Let us try to understand this with the help of an example. Say we have a file named note.dtd with the following content:

<!DOCTYPE note [ <!ELEMENT note (to,from,heading,body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT body (#PCDATA)> ]> Now we can use this DTD to validate the information of some XML document and make sure that the XML file conforms to the rules of that DTD.

Ex: Below is given an XML document that uses note.dtd<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE note SYSTEM "note.dtd"> <note> <to>falcon</to> <from>feast</from> <heading>hacking</heading> <body>XXE attack</body> </note>

So now let's understand how that DTD validates the XML. Here's what all those terms used in note.dtd mean

  • !DOCTYPE note - Defines a root element of the document named note

  • !ELEMENT note - Defines that the note element must contain the elements: "to, from, heading, body"

  • !ELEMENT to - Defines the to element to be of type "#PCDATA"

  • !ELEMENT from - Defines the from element to be of type "#PCDATA"

  • !ELEMENT heading - Defines the heading element to be of type "#PCDATA"

  • !ELEMENT body - Defines the body element to be of type "#PCDATA"

NOTE: #PCDATA means parseable character data.

  • How do you define a new ELEMENT?

    • !ELEMENT

  • How do you define a ROOT element?

    • !DOCTYPE

  • How do you define a new ENTITY?

    • !ENTITY

PreviousXML External EntityNextXPath Injection

Last updated 4 years ago

Was this helpful?