# 4. Website Footprinting

## <mark style="color:red;">1. Gather information with Ping</mark>

```
ping certifiedhacker.com 
```

The reply shows the target's resolved IP address, the TTL, and the round-trip time.

### Finding the maximum frame size supported

```
ping 162.241.216.11 -f -l 1500
```

{% hint style="info" %}
`-f` sets the Don't Fragment flag on the packet (IPv4 only)

`-l` sets the size of the payload in bytes
{% endhint %}

If you get an error like the one below, the packet is too large to be sent without fragmentation, so that size is not supported.

<figure><img src="/files/sCntwxtgr6TVk2XDO5cU" alt=""><figcaption></figcaption></figure>

Now try different sizes until you get a reply. The largest size that still gets a reply is the maximum payload; adding 28 bytes of headers (20 for IPv4, 8 for ICMP) gives the maximum frame size supported on the path.

<figure><img src="/files/x44iDmy6CFbafLkXirf5" alt=""><figcaption></figcaption></figure>
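Trying sizes by hand can be shortened with a binary search. The sketch below is only an illustration: `probe` is a hypothetical callback that should return `True` when `ping <target> -f -l <size>` gets a reply, and `fake_probe` is a stand-in that pretends the path MTU is 1500 bytes so the example runs without touching the network.

```python
from typing import Callable

IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_unfragmented_payload(probe: Callable[[int], bool],
                             low: int = 0, high: int = 1500) -> int:
    """Return the largest size in [low, high] for which probe(size) is True.

    Assumes every size up to some limit succeeds and every larger size fails.
    Returns -1 if no size in the range succeeds.
    """
    best = -1
    while low <= high:
        mid = (low + high) // 2
        if probe(mid):
            best = mid        # mid works, look for something larger
            low = mid + 1
        else:
            high = mid - 1    # mid is too large, look lower
    return best


def fake_probe(size: int) -> bool:
    # Hypothetical stand-in for a real ping with the Don't Fragment flag set.
    # Pretends the path MTU is 1500 bytes.
    return size + IP_HEADER + ICMP_HEADER <= 1500


payload = max_unfragmented_payload(fake_probe)
print("max payload:", payload)
print("frame size :", payload + IP_HEADER + ICMP_HEADER)
```

To use it against a real host, replace `fake_probe` with a function that runs the ping command and checks whether a reply came back.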

### Finding hops with TTL

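The TTL field is 8 bits wide, so the maximum number of hops is 255. The `-i` flag sets the TTL and the `-n` flag sets the number of packets to send. Try different values of `-i`: the lowest value that gets a reply from the target itself is the number of hops.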

```
ping 162.241.216.11 -i 14 -n 1
```

<figure><img src="/files/MDsi1wutqAVIW4JXyLxH" alt=""><figcaption></figcaption></figure>
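The trial-and-error search over TTL values can be written as a simple loop. This is a sketch under assumptions: `reaches_target` is a hypothetical callback that should return `True` when `ping <target> -i <ttl> -n 1` is answered by the target itself rather than by a router on the way, and the 9-hop distance in the stand-in is made up for the demonstration.

```python
from typing import Callable, Optional


def count_hops(reaches_target: Callable[[int], bool],
               max_ttl: int = 255) -> Optional[int]:
    """Return the lowest TTL that reaches the target, or None if none does."""
    for ttl in range(1, max_ttl + 1):
        if reaches_target(ttl):
            return ttl
    return None


def fake_reaches_target(ttl: int) -> bool:
    # Hypothetical stand-in for a real ping: pretends the target is 9 hops away.
    return ttl >= 9


print("hops:", count_hops(fake_reaches_target))
```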

### Other tools

Use `tracert` (Windows) to list every hop on the route and find the number of hops.

```
tracert 162.241.216.11
```

<figure><img src="/files/7H0PZTOkt0ljxjtZa306" alt=""><figcaption></figcaption></figure>

## <mark style="color:red;">2. Website footprinting with Photon</mark>

Photon is an incredibly fast crawler designed for OSINT.

Photon can extract the following data while crawling:

* URLs (in-scope & out-of-scope)
* URLs with parameters (`example.com/gallery.php?id=2`)
* Intel (emails, social media accounts, amazon buckets etc.)
* Files (pdf, png, xml etc.)
* Secret keys (auth/API keys & hashes)
* JavaScript files & Endpoints present in them
* Strings matching custom regex pattern
* Subdomains & DNS related data
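To make the first two categories concrete, here is a minimal sketch of how crawled URLs could be sorted into in-scope, out-of-scope, and parameterised groups. It is not Photon's own code; `classify_urls` is a hypothetical helper, and treating subdomains of the target as in-scope is an assumption made for the example.

```python
from urllib.parse import urlparse


def classify_urls(urls, target_host):
    """Group URLs by scope and pick out in-scope URLs that carry parameters."""
    result = {"in_scope": [], "out_of_scope": [], "with_params": []}
    for url in urls:
        parsed = urlparse(url)
        host = parsed.hostname or ""
        # Assumption: the target host and its subdomains count as in-scope.
        in_scope = host == target_host or host.endswith("." + target_host)
        result["in_scope" if in_scope else "out_of_scope"].append(url)
        if in_scope and parsed.query:
            result["with_params"].append(url)
    return result


crawled = [
    "https://example.com/",
    "https://example.com/gallery.php?id=2",
    "https://blog.example.com/post",
    "https://other.org/page?x=1",
]
groups = classify_urls(crawled, "example.com")
for name, items in groups.items():
    print(name, items)
```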

Crawling can be resource intensive, but Photon has some tricks up its sleeve. You can fetch URLs archived by [archive.org](https://archive.org/) and use them as seeds with the `--wayback` option.

{% embed url="<https://github.com/s0md3v/Photon>" %}

```
python3 photon.py -u https://certifiedhacker.com
```

Results are saved in a directory inside the Photon folder.

**Extensive scan**

```
python3 photon.py -u https://certifiedhacker.com -l 3 -t 200 --wayback
```

* `-u` target URL
* `-l` number of levels to crawl
* `-t` number of threads
* `--wayback` fetches archived URLs from archive.org to use as seeds

## <mark style="color:red;">3. Gather information about the target with CentralOps</mark>

{% embed url="<https://centralops.net/co/>" %}

<figure><img src="/files/ClQyDYpCcUI7QVXijKxn" alt=""><figcaption></figcaption></figure>

**Other tools**

{% embed url="<https://website.informer.com/>" %}

## <mark style="color:red;">4. Getting Information with web data extractors</mark>

Web Data Extractor is a Windows tool and needs to be installed first.

{% embed url="<http://www.webextractor.com/wde.htm>" %}

**Other tools**

{% embed url="<https://www.parsehub.com/>" %}

{% embed url="<https://www.kali.org/tools/spiderfoot/>" %}

{% embed url="<https://github.com/smicallef/spiderfoot>" %}

## <mark style="color:red;">5. Website Mirroring with HTTrack</mark>

HTTrack needs to be installed first; on Windows it ships as WinHTTrack.

<https://www.httrack.com/>

**Other tools**

{% embed url="<https://www.cyotek.com/cyotek-webcopy>" %}

## <mark style="color:red;">6. Website recon with GRecon</mark>

GRecon uses Google search for reconnaissance.

{% embed url="<https://github.com/TebbaaX/GRecon>" %}

## <mark style="color:red;">7. Making a wordlist from a website with CeWL</mark>

```
cewl -w wordlist -d 2 -m 5 www.certifiedhacker.com
```

* `-d` depth to spider to
* `-m` minimum word length
* `-w` file to write the wordlist to
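The idea behind the minimum word length filter can be shown on a single page. The sketch below is not how CeWL is implemented; `build_wordlist` and `TextExtractor` are hypothetical names, the HTML is a made-up sample, and no spidering is done. It only pulls visible text out of the markup and keeps unique words of at least `min_length` letters.

```python
import re
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def build_wordlist(html: str, min_length: int = 5) -> List[str]:
    """Return unique words of at least min_length letters, in first-seen order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[A-Za-z]+", " ".join(parser.chunks))
    unique = {}
    for word in words:
        if len(word) >= min_length:
            unique.setdefault(word, None)  # dict keeps insertion order
    return list(unique)


sample = (
    "<html><head><style>body{color:red}</style></head><body>"
    "<h1>Certified Hacker</h1>"
    "<p>Learn ethical hacking and hacking tools.</p>"
    "<script>var secretvalue = 1;</script>"
    "</body></html>"
)
print(build_wordlist(sample, min_length=5))
```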



